
Question 1:

LBS is a management investment firm managing about $600 million in assets, primarily in stocks and mutual funds, for both institutional and individual investors. It believes that conventional approaches to money management are having an increasingly difficult time meeting or exceeding benchmarks. Further, it believes that the new generation of data-mining techniques can capture significant non-linear causal relationships for use in forecasting when market and security price behavior is dominated by non-linearity.

LBS wants to maximize the return on the assets it invests for its clients while minimizing their risk exposure. For LBS, it is not enough just to know which securities to purchase. In order to be successful, the asset management firm must also know when to buy and sell the securities. The firm feels that it can do this through a combination of high-quality analytic tools, highly efficient computer engineering, and market-savvy analysts.

The problem of developing a system to estimate future prices is daunting because financial processes are generally characterized by high levels of non-linearity and complexity. The amount of data available to an analyst is overwhelming. Also, financial markets are constantly evolving, so models must adapt to these changes. So,

• The system needs to be able to quickly incorporate knowledge about a domain that often defies explicit definition. On a day-to-day basis, random shocks, crowd psychology, and short-lived trends influence financial markets. Also, different experts have widely varying interpretations of the data even after the fact. Even expert traders sometimes have difficulty explaining what general principle led them to make a specific trade.

• The system needs to be able to deal with and analyze complex data. As a result of the interactions among several different market forces, financial markets can exhibit highly non-linear and highly complex behavior.

• The system needs to be able to deal with the large amounts of economic and financial data that are generated daily. It is difficult or impossible even for the most skilled expert to assimilate this amount of data accurately and consistently. In the words of one experienced trader, "Even the smartest of us is not as smart as the market. In order to make sense of the data, we have little choice but to turn to the computer."

• The system needs to be able to adapt quickly over time. A trading strategy that works in a bull market may not fare well in a bear market. Markets evolve and adapt to different forces over time.

The firm has determined that a meaningful horizon is about 4 weeks. It is an "active" manager that seeks to outperform the market, as opposed to a "passive" manager that indexes its portfolio with the market and seeks only to match the market's performance.

The system needs to be able to interpret and analyze large amounts of market data and "update its view of the world" frequently and easily, accessing economic and market data from a variety of sources and, using these data, identifying those stocks that are "likely" to be winners, and those that are more "likely" to be losers, over the next 4 weeks. LBS will use simulated trading systems to test the models.

Models will be tested (or validated) by back testing over several historical years to determine how they would have performed. Models that recommend buying stocks in volumes that were not obtainable or conducting so many trades that transaction fees wiped out profits would not be considered successful.
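The two failure conditions named above (unobtainable volumes and fees that wipe out profits) can be sketched as a simple check on a backtest's trade list. This is only an illustration: the `Trade` record, the flat `fee_per_order`, the `evaluate_backtest` name, and all the numbers are assumptions made for the example and do not come from the LBS case.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    buy_price: float
    sell_price: float
    shares: int
    available_volume: int  # hypothetical: shares that were actually obtainable


def evaluate_backtest(trades, fee_per_order):
    """Summarise a backtest: gross profit, fees, net profit, and feasibility."""
    gross = sum((t.sell_price - t.buy_price) * t.shares for t in trades)
    # Each round trip is assumed to cost one buy order plus one sell order.
    fees = 2 * fee_per_order * len(trades)
    # A model that asks for more shares than were obtainable is not feasible.
    feasible = all(t.shares <= t.available_volume for t in trades)
    net = gross - fees
    return {
        "gross": gross,
        "fees": fees,
        "net": net,
        "feasible": feasible,
        "successful": feasible and net > 0,
    }


# Hypothetical example: three small winning trades whose fees exceed the gains.
trades = [Trade(10.0, 10.5, 100, 5000) for _ in range(3)]
result = evaluate_backtest(trades, fee_per_order=30.0)
print(result)
```

In this made-up run every trade is profitable before costs, yet the model would not count as successful because the fees are larger than the gross profit.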

LBS's data is plentiful, although not necessarily clean. The system does not need to make specific point predictions for prices on a specific date but only to provide the decision maker with estimates of a security's upside and downside potential. On the other hand, since a decision maker (typically a portfolio manager) would be interpreting the results of a prediction, it would be useful if the model could offer some insight into its analysis. It is also important that the system fits smoothly into LBS's workflow and current modeling tools. To do this, the system must interface smoothly with the financial databases where the market data are stored.

Since LBS wants a 4-week time horizon, the system need not function in real-time. On the other hand, the system must be able to perform the analysis on each individual security in a reasonable amount of time. The system also must be able to be expanded to accommodate additional securities and input factors.

In addition, LBS would like to take up as little of the firm's expert traders' time as possible. Expert time is valuable; each hour away from market analysis or trading can cost real dollars. Furthermore, and more important, LBS has found that it could be somewhat difficult for their expert traders and analysts to articulate their expertise, especially since the rules are complex and continually evolving.

(a) Briefly describe the modeling problem facing LBS, and identify what type of problem it is in terms of the types of data mining problems discussed in session 1 (prediction, estimation, classification, clustering, association, etc.). Justify your answer.

(b) What data mining model type would you propose for this problem? Justify your answer.

(c) What are two significant limitations of your proposed approach for the given problem?

Question 2: Assume that you have to build an online recommendation system for buying cars. Cars have hundreds of specifications/features. Comment on whether Naïve Bayes, K Nearest Neighbors or Decision Trees would be the best approach for this type of system. Justify your answer.

Question 3: Assume that using scanner data on customer purchases combined with demographic and behavioral data on customers stored in the corporate data warehouse, you would like to build a predictive model that would help classify customers into one of a set of distinct profitability segments (e.g., high, medium and low). Further, assume that although your company operates across the whole Southern US, you would like to focus on customers spending at least $500 per month on average for the past 12 months, at any of 5 stores in Texas. Discuss whether K-means clustering would be useful to identify the relevant customer set. Justify your answer.

Question 4: Which of the following is a symptom of a decision tree that is "over-fitted"? In each case, briefly justify your answer.

(a) The error rate (misclassification) chart for the model is as in the graphs below (for the training and validation sets):

[Figure: misclassification error-rate chart for the training and validation sets]

(b) The tree is unbalanced (i.e., some paths from the root to leaf nodes are long while others are short)

(c) The confusion (classification) matrices for both the training set and the validation set have large values in the off-diagonal cells. (Hint: In a confusion matrix C, c_ij indicates the number of cases whose actual output value r_i was classified as r_j by the tree.)

(d) The tree has a high overall misclassification rate for the training set but not for the validation set.

(e) A number of the leaf nodes have very low support.
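To make the confusion-matrix hint concrete, here is a minimal sketch that turns a matrix C into an overall misclassification rate by summing the off-diagonal cells. The two matrices are hypothetical numbers chosen for illustration, and `misclassification_rate` is a name introduced here; none of this comes from the question's chart.

```python
def misclassification_rate(confusion):
    """Share of cases in the off-diagonal cells of a square confusion matrix.

    confusion[i][j] is the number of cases whose actual class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Hypothetical two-class matrices for the same tree.
train_matrix = [[48, 2],
                [1, 49]]
validation_matrix = [[35, 15],
                     [12, 38]]

train_error = misclassification_rate(train_matrix)
validation_error = misclassification_rate(validation_matrix)
print(f"training error:   {train_error:.2f}")
print(f"validation error: {validation_error:.2f}")
```

Comparing the two rates is the useful step: a tree that fits the training set far better than the validation set behaves differently from one that is equally poor on both.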

Question 5: Given the following data on purchase transactions expressed as itemsets:

ID   Items
1    Bread, Juice, Ketchup
2    Milk, Juice, Apples
3    Pepper, Apples, Juice, Wine
4    Juice, Ketchup, Wine, Salt
5    Apples, Detergent, Wine
6    Juice, Ketchup, Wine, Apples
7    Bread, Milk, Juice
8    Detergent, Wine, Apples
9    Salt, Wine
10   Juice, Ketchup, Milk, Apples
11   Bread, Apples, Wine
12   Milk, Juice, Detergent, Ketchup

Each row is an itemset (i.e., a collection of items that were bought together).

(a) Identify all the large itemsets with minsup = 0.25 (i.e., 25%). For each large itemset, compute its support as a percentage (%).

(b) Using the results in (a), state one association rule that has a confidence above 80% and acceptable lift. Compute its confidence, support and lift.

(c) If the Apriori approach described in class were used to identify association rules for this data set, identify three itemsets whose support would not have to be calculated by the rule mining process (i.e., their support would not have to be computed). Explain why they would not be considered.
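The quantities asked for in (a) to (c) can be checked with a short sketch of level-wise Apriori over the twelve itemsets above. This is a plain textbook-style version, not necessarily the exact procedure described in class; the function names `support`, `apriori`, and `rule_metrics` are introduced here for illustration.

```python
from itertools import combinations

transactions = [
    {"Bread", "Juice", "Ketchup"},
    {"Milk", "Juice", "Apples"},
    {"Pepper", "Apples", "Juice", "Wine"},
    {"Juice", "Ketchup", "Wine", "Salt"},
    {"Apples", "Detergent", "Wine"},
    {"Juice", "Ketchup", "Wine", "Apples"},
    {"Bread", "Milk", "Juice"},
    {"Detergent", "Wine", "Apples"},
    {"Salt", "Wine"},
    {"Juice", "Ketchup", "Milk", "Apples"},
    {"Bread", "Apples", "Wine"},
    {"Milk", "Juice", "Detergent", "Ketchup"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, minsup):
    """Return {itemset: support} for all itemsets with support >= minsup."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count support only for the surviving candidates of size k.
        level = {}
        for c in candidates:
            s = support(c, transactions)
            if s >= minsup:
                level[c] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return both, confidence, lift


frequent = apriori(transactions, minsup=0.25)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{100 * s:.1f}%")

print(rule_metrics({"Ketchup"}, {"Juice"}, transactions))
```

The prune step is what part (c) is about: a candidate with an infrequent subset (for example, anything containing Pepper, which appears in only one transaction) is discarded before its support is ever counted.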

Question 6:

Consider the following dataset about customers of a particular product. The column "Buyer" indicates whether each customer bought the product or not. You have been asked to use Naïve Bayes Classification to identify potential buyers.

Name     Married  Job       Hair   Gender  Buyer
Peter    No       Manager   Short  Male    Yes
Claudia  Yes      Engineer  Long   Female  No
Angela   No       Lawyer    Long   Female  No
Amy      No       Manager   Long   Female  Yes
Albert   Yes      Engineer  Short  Male    Yes
Karin    No       Manager   Long   Female  No
Nina     Yes      Engineer  Short  Female  Yes
Sergio   Yes      Manager   Long   Male    Yes

Would the following person be a buyer or not (show your calculations)?

Name     Married  Job       Hair   Gender  Buyer
John     Yes      Engineer  Short  Male    ?
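The Naïve Bayes calculation for John can be sketched as follows, using exact fractions so each step is easy to check by hand. The function name `naive_bayes_scores` and the optional Laplace smoothing parameter `alpha` are additions for illustration; the question itself does not mention smoothing, so the default is `alpha=0`.

```python
from fractions import Fraction

# (Married, Job, Hair, Gender, Buyer), one row per customer in the table.
DATA = [
    ("No", "Manager", "Short", "Male", "Yes"),     # Peter
    ("Yes", "Engineer", "Long", "Female", "No"),   # Claudia
    ("No", "Lawyer", "Long", "Female", "No"),      # Angela
    ("No", "Manager", "Long", "Female", "Yes"),    # Amy
    ("Yes", "Engineer", "Short", "Male", "Yes"),   # Albert
    ("No", "Manager", "Long", "Female", "No"),     # Karin
    ("Yes", "Engineer", "Short", "Female", "Yes"), # Nina
    ("Yes", "Manager", "Long", "Male", "Yes"),     # Sergio
]


def naive_bayes_scores(rows, query, alpha=0):
    """Return {class: P(class) * product of P(attribute value | class)}.

    alpha is an optional Laplace smoothing constant (an assumption of this
    sketch); alpha=0 gives the plain relative-frequency estimates.
    """
    scores = {}
    for label in sorted({r[-1] for r in rows}):
        in_class = [r for r in rows if r[-1] == label]
        score = Fraction(len(in_class), len(rows))  # prior P(class)
        for j, value in enumerate(query):
            matches = sum(1 for r in in_class if r[j] == value)
            n_values = len({r[j] for r in rows})  # distinct values of attribute j
            score *= Fraction(matches + alpha, len(in_class) + alpha * n_values)
        scores[label] = score
    return scores


john = ("Yes", "Engineer", "Short", "Male")
scores = naive_bayes_scores(DATA, john)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Without smoothing, the score for "No" collapses to zero because no non-buyer in the table has short hair or is male; calling the function with `alpha=1` shows how the comparison looks when zero counts are smoothed.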

Question 7: Assume that you have joined a company that sells disk drives for PCs. It has decided to enter the market for mobile phones starting next year. The CEO has heard that neural nets are powerful tools for building classification and prediction models, and has asked you to build a Neural Network model for classifying mobile phone products proposed by your R&D department into one of the following three market potential categories: Low, Medium, High. You have been given access to detailed data on the company's products and sales for ten of the last eleven years (current year sales have still to be compiled). How would you respond?

Question 8: Your boss has suggested that rather than using a single type of classification model, it might be useful to combine the strengths of different model types. So she has suggested that you initially build a set of neural network models to figure out the key determinants of buying behavior in each segment, and then use these significant variables to build a decision tree model that would provide the key thresholds of each variable that influence the important outcomes in future buying behavior. How would you respond?


