1. This exercise revisits the Hitters data set.

(a) The glmnet() function, by default, internally scales the predictor variables so that they will have standard deviation 1, before solving the ridge regression or lasso problems. This is a result of its default setting standardize=TRUE. Explain why such scaling is appropriate for this application.

(b) Verify that, for a very small value of λ, both the ridge regression and lasso estimates are very close to the least squares estimates. Also verify that, for a very large value of λ, both the ridge regression and lasso estimates approach 0 in all components (except the intercept, which is not penalized by default).

(c) An alternative method for selecting the tuning parameter λ is to use the one-standard-error rule. Under this rule, instead of choosing the λ that minimizes the cross-validated estimate of test MSE, we choose the largest value of λ for which that estimate is within one standard error of the minimum. Provide a rationale for the one-standard-error rule.

(d) For each of the ridge regression and lasso models corresponding to the grid of λ values defined in the notes, perform 5-fold cross-validation to determine the best value of λ. Report the results from both the usual minimum-MSE rule and the one-standard-error rule for choosing λ. Note that cv.glmnet() returns the value of λ selected using the one-standard-error rule under the name lambda.1se.

(e) From the last part, you should have computed 4 values of the tuning parameter:

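$\lambda^{\text{ridge}}_{\text{min}}$, $\lambda^{\text{ridge}}_{\text{1se}}$, $\lambda^{\text{lasso}}_{\text{min}}$, $\lambda^{\text{lasso}}_{\text{1se}}$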

These are the results of running 5-fold cross-validation on each of the ridge and lasso models, and using the usual rule (min) or the one-standard-error rule (1se) to select λ. Now, using the predict() function with type="coef", report the coefficient estimates at the appropriate values of λ. That is, you will report two coefficient vectors coming from ridge regression, with $\lambda = \lambda^{\text{ridge}}_{\text{min}}$ and $\lambda = \lambda^{\text{ridge}}_{\text{1se}}$, and likewise for the lasso. How do the coefficient estimates from the usual rule compare to those from the one-standard-error rule? How do the ridge estimates compare to those from the lasso?

(f) Suppose that you were coaching a young baseball player who wanted to strike it rich in the major leagues. What handful of attributes would you tell this player to focus on?
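
The selection step described in parts (c) and (d) can be sketched in a few lines. The following is a minimal, language-agnostic illustration in Python (the exercise itself uses cv.glmnet() in R); the function name `select_lambdas` and the cross-validation numbers are hypothetical and made up purely to show the mechanics. It assumes you already have, for each λ on the grid, the mean cross-validated MSE and its standard error.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries.

    lambdas : grid of tuning parameter values
    cv_mean : mean cross-validated MSE for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    # Usual rule: the lambda with the smallest mean CV error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])

    # One-standard-error rule: the largest lambda whose mean CV error
    # is no more than one standard error above the minimum.
    threshold = cv_mean[i_min] + cv_se[i_min]
    lam_1se = max(lam for lam, m in zip(lambdas, cv_mean) if m <= threshold)

    return lambdas[i_min], lam_1se


# Hypothetical CV results (illustrative values only, not from Hitters).
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
mean_mse = [105.0, 100.0, 102.0, 108.0, 150.0]
se_mse = [6.0, 5.0, 5.0, 6.0, 9.0]

lam_min, lam_1se = select_lambdas(grid, mean_mse, se_mse)
print(lam_min, lam_1se)
```

Because a larger λ means a more heavily penalized model, the one-standard-error choice is never smaller than the minimum-MSE choice.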

2. Predict the number of applications received (Apps) using the other variables in the College data set, which is available in the ISLR library.

(a) Use ?College to access information about the data set and answer the following questions. Note that you may also find the summary() function useful.

i. Not including Apps, how many variables are in the data set? In other words, what is p?

ii. Are there any missing values in the data set? If so, remove them.

iii. What is the sample size (once missing values have been removed, if necessary)? In other words, what is N?

iv. Are there any qualitative variables in the data set? If so, list them.

(b) Split the data set into a training set and a test set.

(c) Fit a linear model using least squares on the training set and report the test error obtained.

(d) Fit a ridge regression model on the training set, with λ chosen by cross-validation. Report the test error obtained.

(e) Fit a lasso model on the training set, with λ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

(f) Comment on the results obtained. How accurately can we predict the number of college applications received? Is there much difference among the test errors resulting from these three approaches?
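
Parts (b) through (e) all rely on the same two mechanics: a reproducible random split of the rows, and a test error computed only on the held-out rows. The sketch below shows one way these could look, written in Python as a language-agnostic illustration (the exercise itself is in R). The helper names `train_test_split` and `mean_squared_error`, the 50/50 default split, and the seed are all assumptions chosen for the example, not something the exercise prescribes.

```python
import random


def train_test_split(n_rows, test_fraction=0.5, seed=1):
    """Return (train_indices, test_indices) for a data set with n_rows rows."""
    rng = random.Random(seed)          # fixed seed makes the split reproducible
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical usage: split 10 rows, then score some made-up predictions.
train_idx, test_idx = train_test_split(10, test_fraction=0.3, seed=42)
print(len(train_idx), len(test_idx))

observed = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
print(mean_squared_error(observed, predicted))
```

Fitting each model on the training rows only and scoring all three on the same test rows keeps the comparison in part (f) fair.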
