Project Tasks

Task 1: Analytic Objective

Each group is expected to come up with an analytic objective that applies pattern discovery and predictive modelling using the SAS Enterprise Miner software. A well-drafted business case will help you understand your data set, identify variable roles and measurement levels, and ultimately choose your method of analysis.

An example of your analytic objective could take this form:

"A radio station wants to analyze the use of Web services such as simulcasts, podcasts, news streams, music streams, archives, and live Web music to see whether any unusual patterns exist in the combinations of services selected by its Web users. In this case study, you perform an association analysis"

Note: Groups are encouraged to come up with different analytic objectives; no two (2) groups should have the same objective. Each group should attempt pattern discovery and predictive modelling using the data set assigned for this exercise.

Task 2: Data Analysis and Definition

Prepare, in tabulated form, a data dictionary that defines the variables as they appear in your data set, together with their model roles and measurement levels. An example is shown below.

Name    | Model Role | Measurement Level | Description
STOREID |            | Nominal           | Identification number of the store

Tip 1: Execute the following steps in SAS Enterprise Miner

(i). Create a project with your group and group number as its name.

(ii). Create a library.

(iii). Create a data source by defining the data set (the one assigned to you) as a data source.

(iv). Determine whether the variable roles and measurement levels assigned to the variables are appropriate. They should match the values in the data definition table above. Examine the distribution of the variables.

2.1 Answer the following Questions.

1. Are there any unusual data values in any of your assigned input variables? Support your answer with appropriate argument.

2. List two possible strategies for handling cases with unusual values before attaching your desired analysis node. Explain the scenarios in which each strategy is appropriate.

3. Are there missing values in any of the input variables?

4. If you assigned a variable the Rejected role, why is this the case?
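
Questions 1 and 3 can be checked outside SAS Enterprise Miner as well. The sketch below is only an illustration in Python, not part of the required workflow: it counts missing entries and flags values outside the common 1.5 x IQR fences. The `ages` list and the function name `summarize` are hypothetical, and the IQR rule is one assumed definition of "unusual" among several.

```python
from statistics import quantiles

def summarize(values):
    """Count missing entries and flag values outside the 1.5 * IQR fences."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartiles of the non-missing values
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return {
        "missing": len(values) - len(present),
        "unusual": [v for v in present if v < low or v > high],
    }

# Hypothetical input variable: one missing entry and one implausible age.
ages = [34, 41, 29, None, 38, 45, 31, 250, 36]
report = summarize(ages)
print(report)
```

A flagged value is only a candidate for review; whether to drop, cap, or keep it depends on the business case.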

Task 3: Cluster and Association Analysis

For groups running a cluster or association analysis, the following tips should help; answer the questions that follow each tip.

Tip 2: Execute the following steps in SAS Enterprise Miner

(v). Add your data source to the diagram workspace.

(vi). Add a Cluster node to the diagram workspace and connect it to the data source node.

(vii). Select the Cluster node and set Internal Standardization to Standardization.

(viii). Specify a maximum of six clusters and run the diagram from the Cluster node.

(ix). Add a Segment Profile node to the diagram workspace and connect it to the Cluster node.

(x). Run the diagram from the Segment Profile node.

3.1 Answer the following Questions.

5. What would happen if you did not standardize your inputs?

6. Using the results of the Segment Profile node, interpret the characteristics of the three largest clusters.

7. Why was cluster analysis chosen?
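
To reason about question 5, it can help to see how distances behave before and after rescaling. The sketch below assumes that standardization means rescaling each input to zero mean and unit standard deviation, which may differ in detail from what the Cluster node does internally. The `income` and `visits` values are hypothetical.

```python
from math import dist
from statistics import mean, pstdev

def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]

# Hypothetical customers: income is on a much larger scale than visits.
income = [30000, 32000, 90000]
visits = [2, 20, 3]

raw = list(zip(income, visits))
scaled = list(zip(standardize(income), standardize(visits)))

# Distances from customer 0 to customers 1 and 2, before and after scaling.
raw_01, raw_02 = dist(raw[0], raw[1]), dist(raw[0], raw[2])
scaled_01, scaled_02 = dist(scaled[0], scaled[1]), dist(scaled[0], scaled[2])
print(raw_01, raw_02)
print(scaled_01, scaled_02)
```

On the raw values the distances are driven almost entirely by income, so the large difference in visits between customers 0 and 1 is invisible. After rescaling, both inputs contribute on a comparable footing.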

Tip 3: Execute the following steps in SAS Enterprise Miner

(i). Create a new diagram and name it after your data set.

(ii). Create a new data source using the data set.

(iii). Assign the appropriate roles to the variables.

(iv). Add the node for the data set and an Association node to the diagram.

(v). Change the setting for Export Rule by ID to Yes.

(vi). Leave the remaining default settings for the Association node and run the analysis.

3.2 Answer the following Questions.

1. What is the highest lift value for the resulting rules?

2. Which rule has this value?

3. Why was an Association Analysis run?
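
When interpreting the lift values asked for above, it may help to compute one by hand. The sketch below uses the usual definition, lift = confidence(A => B) / support(B), on a handful of hypothetical transactions loosely modelled on the radio station example. The function name and the data are illustrative assumptions, not output from the Association node.

```python
def lift(transactions, left, right):
    """Lift of the rule left => right: confidence divided by support of right."""
    left, right = set(left), set(right)
    n = len(transactions)
    n_left = sum(left <= t for t in transactions)             # contain left side
    n_right = sum(right <= t for t in transactions)           # contain right side
    n_both = sum((left | right) <= t for t in transactions)   # contain both sides
    confidence = n_both / n_left
    expected = n_right / n
    return confidence / expected

# Hypothetical service selections, one set per Web user.
users = [
    {"podcast", "archive"},
    {"podcast", "archive", "news"},
    {"music"},
    {"music", "news"},
    {"podcast", "archive"},
    {"news"},
    {"music", "simulcast"},
    {"podcast", "music"},
]
rule_lift = lift(users, {"podcast"}, {"archive"})
print(rule_lift)
```

A lift above 1 suggests the two sides occur together more often than they would if selected independently; a lift near 1 suggests no association.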

Task 4: Predictive Modeling

For groups running their analysis with decision trees, regression, or neural networks, the following tips should help; answer the questions that follow each tip.

Tip 4: Decision trees - Execute the following steps in SAS Enterprise Miner

(i). Create a new diagram named Predictive Analysis in your project.

(ii). Define the data set as a data source for the project. Set the roles for the analysis variables as shown above.

(iii). Add the data set to the diagram workspace.

(iv). Add a Data Partition node to the diagram and connect it to the Data Source node. Assign 50% of the data for training and 50% for validation.

(v). Add a Decision Tree node to the workspace and connect it to the Data Partition node.

(vi). Create a decision tree model autonomously using average squared error as the model assessment statistic.

(vii). Add a second Decision Tree node to the diagram and connect it to the Data Partition node.

(viii). In the Properties panel of the new Decision Tree node, change the maximum number of branches from a node to 3 to allow for three-way splits.

(ix). Create a second decision tree model autonomously using average squared error as the model assessment statistic.

4.1 Answer the following Questions.

1. Why was the Target Variable assigned that variable role?

3. How many leaves are there in the optimal tree created in step (vi)? Which variable was used for the first split, and why was this variable chosen over the others?

4. How many leaves are there in the optimal tree created in step (ix)?

5. Which of the decision tree models appears to be better:

a. based on average squared error on training data?

b. based on average squared error on validation data?
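
Question 5 contrasts average squared error on the training and validation partitions. The sketch below illustrates the idea with a one-split tree (a "stump") whose leaf predictions are the mean target in the training data. The data, the threshold of 30, and the function names are hypothetical, and the calculation is a simplified stand-in for what the Decision Tree node reports.

```python
from statistics import mean

def fit_stump(rows, threshold):
    """Leaf predictions: mean target for x < threshold and for x >= threshold."""
    left = [y for x, y in rows if x < threshold]
    right = [y for x, y in rows if x >= threshold]
    return mean(left), mean(right)

def average_squared_error(rows, threshold, leaves):
    """Mean squared difference between each target and its leaf prediction."""
    left_pred, right_pred = leaves
    return mean(
        (y - (left_pred if x < threshold else right_pred)) ** 2 for x, y in rows
    )

# Hypothetical (input, binary target) pairs, split 50/50 into two partitions.
train = [(20, 0), (25, 0), (30, 1), (45, 1), (50, 1), (60, 0)]
valid = [(22, 0), (28, 1), (40, 1), (48, 1), (55, 0), (65, 0)]

leaves = fit_stump(train, threshold=30)          # fitted on training data only
train_ase = average_squared_error(train, 30, leaves)
valid_ase = average_squared_error(valid, 30, leaves)
print(train_ase, valid_ase)
```

The training figure is typically the more optimistic of the two, because the leaf predictions were chosen to fit those very cases; the validation figure is the fairer basis for comparing trees.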

Tip 5: Regression - Execute the following steps in SAS Enterprise Miner

(x). Attach the StatExplore tool to the data source and run it. View the results of the StatExplore tool and determine if any of the variables have missing values.

(xi). Add an Impute node to the diagram and connect it to the Data Partition node. Set the node to impute U for unknown class variable values and the overall mean for unknown interval variable values. Create imputation indicators for all imputed inputs.

(xii). Add a Regression node to the diagram and connect it to the Impute node. Choose the stepwise selection and average squared error as the selection criterion. Run the Regression node and view the results.

(xiii). Disconnect the Impute node from the Data Partition node. Add a Transform Variables node to the diagram and connect it to the Data Partition node. Connect the Transform Variables node to the Impute node.

(xiv). Apply a log transformation to the DemAffl and PromTime inputs and run the Transform Variables node.

(xv). Rerun the Regression node.

4.2 Answer the following Questions.

6. In preparation for regression, is any missing-value imputation needed? If yes, should you do this imputation before generating the decision tree models? Why or why not?

7. Which variables are included in the final regression model generated in step (xii)? List the variables in the descending order of importance to the model.

8. Which variables are included in the final regression model generated in the last step?

9. Based on average squared error on the validation data, which of the two regression models appears to be better?
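
The transform-then-impute order set up in steps (xiii) and (xiv) can be sketched as follows. This is an illustration only: it assumes the log transformation is log(x + 1) to stay defined at zero, which may not match the exact formula the Transform Variables node applies, and the `prom_time` values are hypothetical.

```python
from math import log1p
from statistics import mean

def log_transform(values):
    """Apply log(x + 1) to each value, leaving missing entries as None."""
    return [None if v is None else log1p(v) for v in values]

def impute_mean(values):
    """Replace missing entries with the mean and return a 0/1 missing indicator."""
    fill = mean([v for v in values if v is not None])
    imputed = [fill if v is None else v for v in values]
    indicator = [int(v is None) for v in values]
    return imputed, indicator

# Hypothetical right-skewed input with one missing value.
prom_time = [1, 3, None, 7, 40]
log_prom_time, was_missing = impute_mean(log_transform(prom_time))
print(log_prom_time, was_missing)
```

Keeping the indicator alongside the imputed input lets the regression pick up any signal carried by the fact that a value was missing.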

Tip 6: Neural Networks - Execute the following steps in SAS Enterprise Miner

(xvi). Add a Neural Network tool to the diagram. Connect the Impute node to the Neural Network node.

(xvii). Set the model selection criterion to average squared error. Run the Neural Network node.

4.3 Answer the following Questions.

10. How many weights does the neural network model generated in step (xvii) include?

11. Examine the validation average squared error of the neural network model. How does it compare to the two decision tree models and the regression model generated after applying log transformation?
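
As a cross-check for question 10, the number of weights in a fully connected network with one hidden layer and bias terms can be counted by hand. The sketch below assumes that architecture; the input, hidden, and output counts are hypothetical, so substitute the values reported for your own model (class inputs may expand into several input units).

```python
def count_weights(n_inputs, n_hidden, n_outputs=1):
    """Weights in a one-hidden-layer network, counting one bias per unit."""
    input_to_hidden = (n_inputs + 1) * n_hidden     # +1 for each hidden bias
    hidden_to_output = (n_hidden + 1) * n_outputs   # +1 for each output bias
    return input_to_hidden + hidden_to_output

# Hypothetical network: 10 input units, 3 hidden units, 1 output unit.
n_weights = count_weights(n_inputs=10, n_hidden=3)
print(n_weights)
```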

Task 5: Compare your models

Execute the following steps in SAS Enterprise Miner

(xviii). Add a Model Comparison node to the diagram. Connect it to all the predictive models generated in the earlier steps.

(xix). Run the Model Comparison node.

4.4 Answer the following Questions.

12. Examine the results of the Model Comparison node. Of the predictive models compared, which model has been selected by the Model Comparison node? On what selection criterion was this model selected?

13. Change the default values of the Model Comparison node properties so that it selects the model having the least average squared error on the validation data. Run the Model Comparison node again. Which model has been selected now?

14. Why are the models compared?
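
Questions 12 and 13 turn on the fact that the selected model depends on the selection statistic. The sketch below shows two hypothetical models where misclassification rate and average squared error disagree. The predicted probabilities are made up for illustration, and no claim is made here about which statistic the Model Comparison node uses by default.

```python
def average_squared_error(actual, predicted):
    """Mean squared difference between the 0/1 target and predicted probability."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def misclassification_rate(actual, predicted, cutoff=0.5):
    """Share of cases whose predicted class (probability >= cutoff) is wrong."""
    wrong = sum(int(p >= cutoff) != a for a, p in zip(actual, predicted))
    return wrong / len(actual)

# Hypothetical validation targets and predicted probabilities per model.
target = [1, 1, 0, 0]
models = {
    "Decision Tree": [0.51, 0.51, 0.49, 0.49],  # always right, never confident
    "Regression": [0.95, 0.45, 0.05, 0.05],     # one miss, otherwise confident
}

def select(statistic):
    """Name of the model with the smallest value of the given statistic."""
    return min(models, key=lambda name: statistic(target, models[name]))

by_misclassification = select(misclassification_rate)
by_ase = select(average_squared_error)
print(by_misclassification, by_ase)
```

Here the first model wins on misclassification rate and the second on average squared error, which is why the criterion should be stated alongside the chosen model.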

Task 6: Business Implication

1. From the outcome of your analysis of the data set and the business case you have come up with, what can you deduce, recommend, and conclude?

2. What business implications can be drawn from the process of building and comparing these models, and has this practice helped resolve the business issue? Why or why not?

Category:- DBMS
Reference No.:- M9741920
