Ask Homework Help/Study Tips Expert

Case Study Part 2 Assignment

The assignment for part 2 of the case study extends your learning experience from units 6 and 7 with new Pentaho features and transformation concepts. You will use most of the Pentaho steps and transformation concepts from assignment 4. You will use new steps for adding constants and updating tables. You will extend transformation processing to capture rejected change data in an error log (Table Output step), use a one-sided outer join to delay rejection of non-matching rows, use the Oracle pseudo column RowNum in SELECT statements, and use output steps (Insert/Update and Update) to check referential integrity rather than a merge join step. In addition, you will work with a larger amount of change data.

As preparation for this assignment, you should review the guided tutorial about Pentaho Data Integration for Oracle and your graded assignment in unit 7. You should also read the case study documents about data integration background (unit 13) and data warehouse design (unit 12). Since this assignment only uses a subset of the data warehouse, you do not need to have detailed understanding of the unused parts of the data warehouse design.

Since the requirements for this assignment are detailed, you should plan to read the assignment and related background documents carefully. You may want to develop the transformation incrementally for each major requirement. You should use the test data warehouse tables before you apply your transformation to the production data warehouse tables.

Validation Requirements
Your Pentaho transformation should perform five types of validation. Change rows should be rejected for each violation of a validation rule.
- Null values: reject any column except the primary key (Change_Id) for null values. You do not need to perform a null value check on the primary key column.

- Foreign key references: reject invalid foreign key references for location (Location_Id), customer (Customer_Id_Ordered_By), sales agent (Sales_Agent_Id), and sales class (Sales_Class_Id).

- Date references: reject invalid dates in date columns (Contract_Date, Date_Promised, Date_Shipped_By). The dates are stored as text in the job change data table with a format of yyyy-mm-dd. The combination of month, day, and year should be a valid date that exists in the W_Time_D table. Note that invalid date and foreign key references will be captured in the same error log step as explained in the section about using the Insert/Update step.

- Business day differences: reject a job change row if differences between dates do not satisfy difference constraints. Since this constraint involves differences in business days, the difference is computed by subtracting row numbers associated with Time_Id values in the W_Time_D table. Note that you cannot just subtract the Time_Id values because of the business day requirements. The Additional Tips section provides background about retrieving row numbers.

o The row number of the Date_Promised must be between 14 and 30 days after the row number of the Contract_Date.
o The row number of the Date_Ship_By must 2 to 7 days before the row number of the Date_Promised.
o You will need a Calculator step to compute the difference in days after retrieving row number values associated with each date column in the job change data table.
o Note that the time dimension table contains a custom calendar with only working days, not weekends. Holidays, if falling on week days, are included in the time dimension table, however. Note, that you cannot use a date difference function to calculate the difference in days because the constraint involves differences in work days, not just the difference in days.

- Lead validation: The Lead_Id in the job change data table must exist in the W_Lead_F table. In addition, the Created_Date in the W_Lead_F table should be chronologically before the Contract_Date in the job change data table. This validation checking occurs after inserting a row into the W_Job_F table. The other validation rules are checked either before or during the insertion into the W_Job_F table.

Inserting into the Job Fact Table and Updating the Lead Fact Table

You should use the Insert/Update step to insert a row into the W_Job_F table. You should use an Oracle sequence object to generate the primary key of the W_Job_F table using an Add Sequence step before the Insert/Update step. An item in the Additional Tips section explains the usage of the Insert/Update step.

You should use the Insert/Update step to check foreign key errors for location (Location_Id), customer (Customer_Id_Ordered_By), sales agent (Sales_Agent_Id), and sales class (Sales_Class_Id), as well as the Time_Id values associated with the date columns. Since the merge join step does not provide error flows, you cannot use the merge join step to check foreign key errors. The merge join step will reject non-matching rows, but it does not allow non matching rows to be sent to another step for error processing.

To update the W_Lead_F table, you should use the Update step, not the Insert/Update step. Although you did not use the Update step in unit 6, it is like the Insert/Update step. You should update the W_Lead_F table after inserting a row into the W_Job_F table. If the validation in the Update step succeeds, you need to update the Job_Id and Success columns of the W_Lead_F table. The Job_Id should be assigned the generated Job_Id in the stream. The Success column should be set to "Y".

Rejected rows will not update the W_Lead_F table. However, after inserting a row into the W_Job_F table, a validation error may occur when updating the W_Lead_F table. There is no way to avoid this anomalous situation in which a row is inserted into the job fact table, but the associated lead fact table row is not updated.

Error Log Insertion

If a row in the job change data table fails a validation rule, a row must be inserted into the error log table. For all validation errors, the change row should contain the generated Log_Id, Change_Id of the rejected row, and an error message (Note column) indicating the reason for rejection. For errors related to updating the W_Lead_F table, the error log should contain the Job_Id of the related W_Job_F row. The Log_Id value should be generated by an Oracle sequence and available in the stream. Table 1 summarizes error handling using the error log.

Table 1: Error Handling using the W_Error_Log Table

Type of Error

Columns

Comments

Null value errors

Change_Id, Note

Log_Id generated using a sequence. Job_Id is null in the error log row.

Date difference errors

Change_Id, Note

Log_Id generated using a sequence. Job_Id is null in the error log row.

Foreign key errors including date FKs

Change_Id, Note

Log_Id generated using a sequence. Job_Id is null in the error log row.

Lead update errors

Change_Id, Job_Id, Note

Log_Id generated using a sequence. Job_Id should come from the stream.

You send a rejected row to another step in Filter Rows, Insert/Update, and Update steps. The Merge Join step does not have a feature for rejecting rows. In a Filter Rows step, you should send false data to another step to insert rows into the W_Error_Log table. In Insert/Update and Update steps, you specify error handling using the Define Error Handling item (right click on the step and select this item near the bottom of the item list).

To facilitate error messages in the error log table, you should create an "Add constants" step in the initial steps of the transformation. You need to assign values to constants for error messages and the Success column in the W_Lead_F table. You need to specify the data type for each constant field in the step. You should create separate fields in the "Add constants" step for each type of validation error: invalid foreign key errors, invalid date differences, null value errors, and lead table errors.

Case Study Part 3 Assignment

The assignment for part 3 of the case study extends your learning experience from units 8 and 9. You will be challenged to develop SQL statements for major business intelligence needs. Some of the statements will be more complex than you encountered in unit 9 due to the larger schema size and relationships among fact tables.

As preparation for this assignment, you should review material about the GROUP BY clause as well as analytic functions. You need to sharpen skills for query formulation involving row summaries as a foundation for this assignment. You also should review concepts and examples involving analytical functions in unit 9.

For this assignment, you should understand business intelligence needs and relate the needs to the data warehouse tables. For business intelligence needs, you should read the associated document providing background about analytical query formulation. After you are clear about the database representation for a problem, you should then begin writing SQL statements. To help structure your query formulations, you should create basic statements without the analytic function elements. After your basic statements execute correctly, you can then revise them for analytic function elements.

Base Queries
The base queries involve two broad areas, revenues/costs and quality control. To facilitate reuse of base queries with analytic function requirements, CREATE VIEW statements are required for some of the base queries.

To calculate the difference in business days in the base queries for contractual delays (BQ5 and BQ6), you should use the getBusDaysDiff function as shown in Appendix A. You can use this function to calculate the difference in business days between two-time identifier values. After compiling this function, you can use it in a SELECT statement just like you use Oracle built-in functions. Without this function, calculating the difference in business days involves much more complex SQL coding.

To assist with formulation of analytic queries, you should create views (not materialized views) for the queries involving revenue summary, cost summary, date promised delays, and shipped by date delays. You can use these views just like base tables in queries for analytic functions.

Queries involving Analytic Functions

The base queries can be used in many queries involving analytic functions. This assignment involves a small subset of possible analytic queries using the base queries. In some analytic queries, you should extend base queries with analytic functions. In other analytic queries, you should use a view in the FROM clause instead of directly extending base queries.

Analytic queries involving customer revenue trends

You should write two analytic queries extending the base query for job revenue trends as summarized in Table 2. Both analytic queries involve window comparisons.

Attachment:- Case Study.zip

Homework Help/Study Tips, Others

  • Category:- Homework Help/Study Tips
  • Reference No.:- M92751073
  • Price:- $90

Guranteed 48 Hours Delivery, In Price:- $90

Have any Question?


Related Questions in Homework Help/Study Tips

Review the website airmail service from the smithsonian

Review the website Airmail Service from the Smithsonian National Postal Museum that is dedicated to the history of the U.S. Air Mail Service. Go to the Airmail in America link and explore the additional tabs along the le ...

Read the article frank whittle and the race for the jet

Read the article Frank Whittle and the Race for the Jet from "Historynet" describing the historical influences of Sir Frank Whittle and his early work contributions to jet engine technologies. Prepare a presentation high ...

Overviewnow that we have had an introduction to the context

Overview Now that we have had an introduction to the context of Jesus' life and an overview of the Biblical gospels, we are now ready to take a look at the earliest gospel written about Jesus - the Gospel of Mark. In thi ...

Fitness projectstudents will design and implement a six

Fitness Project Students will design and implement a six week long fitness program for a family member, friend or co-worker. The fitness program will be based on concepts discussed in class. Students will provide justifi ...

Read grand canyon collision - the greatest commercial air

Read Grand Canyon Collision - The greatest commercial air tragedy of its day! from doney, which details the circumstances surrounding one of the most prolific aircraft accidents of all time-the June 1956 mid-air collisio ...

Qestion anti-trustprior to completing the assignment

Question: Anti-Trust Prior to completing the assignment, review Chapter 4 of your course text. You are a manager with 5 years of experience and need to write a report for senior management on how your firm can avoid the ...

Question how has the patient and affordable care act of

Question: How has the Patient and Affordable Care Act of 2010 (the "Health Care Reform Act") reshaped financial arrangements between hospitals, physicians, and other providers with Medicare making a single payment for al ...

Plate tectonicsthe learning objectives for chapter 2 and

Plate Tectonics The Learning Objectives for Chapter 2 and this web quest is to learn about and become familiar with: Plate Boundary Types Plate Boundary Interactions Plate Tectonic Map of the World Past Plate Movement an ...

Question critical case for billing amp codingcomplete the

Question: Critical Case for Billing & Coding Complete the Critical Case for Billing & Coding simulation within the LearnScape platform. You will need to create a single Microsoft Word file and save it to your computer. A ...

Review the cba provided in the resources section between

Review the CBA provided in the resources section between the Trustees of Columbia University and Local 2110 International Union of Technical, Office, and Professional Workers. Describe how this is similar to a "contract" ...

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

Why might a bank avoid the use of interest rate swaps even

Why might a bank avoid the use of interest rate swaps, even when the institution is exposed to significant interest rate

Describe the difference between zero coupon bonds and

Describe the difference between zero coupon bonds and coupon bonds. Under what conditions will a coupon bond sell at a p

Compute the present value of an annuity of 880 per year

Compute the present value of an annuity of $ 880 per year for 16 years, given a discount rate of 6 percent per annum. As

Compute the present value of an 1150 payment made in ten

Compute the present value of an $1,150 payment made in ten years when the discount rate is 12 percent. (Do not round int

Compute the present value of an annuity of 699 per year

Compute the present value of an annuity of $ 699 per year for 19 years, given a discount rate of 6 percent per annum. As