Ask DBMS Expert


Home >> DBMS

Module Learning Outcomes

Module Learning Outcomes are the official statement of what you are intended to gain from the module. ("On successful completion of the module the student will be able to ....."). All module specifications carry these statements.

In the table below we list the three Module Learning Outcomes for Data Warehousing. Then in purple italics we state what this assignment asks you to do in order to demonstrate your achievement of each MLO.

- Apply concepts and justify decisions when modelling, designing and constructing practical examples or paper descriptions of applications in this area.

You are provided with data and a working project definition for a Simple Star. Demonstrate your ability to design, apply and discuss FOUR sets of changes to implement more advanced concepts.

A range of suggestions for concept topic areas is given.

- Describe and critically evaluate the role and relevance of data warehousing and analytical investigation to the solution of business information problems.
and ...

- Explain the concepts that underpin the subject area of data warehousing, making reference to main established concepts and some developing areas.

Using online and / or printed literature sources, critically evaluate your implementation, in particular addressing the question "Except for scale (number of rows of data), what is different about your implementation compared to what you would expect for a full industrial* DW implementation?"

Part A Amend/extend an existing Data Warehousing implementation.

You are provided with all the data and a working project definition for a Simple Star. Demonstrate your ability to design, apply and discuss FOUR sets of changes to implement more advanced concepts.

A range of suggestions for concept topic areas is given.

During the module you have been provided with an SQL Database "SalesStarAssignment" containing data tables forming a Simple Star, plus a Visual Studio (aka Data Tools) Project to generate a Data Cube. The teaching notes also provided small number of NewFacts tables, each representing three months and one day's further information on sales.

Amend and/or extend this application in FOUR of the following ways

- Ensuring data quality

The datasets provided for this assignment have been checked to ensure that all data is valid. Even within the provided FactUpdate tables there are no incorrect dates, and no references to items that are not listed in the Dimension tables.

In real implementations it is very unsafe to assume that all entries in NewFacts are valid, because these tables are usually compiled from data from operational data sources, and those sources may have low quality control.

- Use literature (cite your sources, of course) to find what you believe to be the most common data quality problems that occur with new facts.

- Edit one of the provided FactUpdate tables to include examples of the problems you list.
Implement processes to detect and resolve the "problems" you introduced into the data set. In your report illustrate and explain your processes.

- Implementing additional Calculated Measures and Key Performance Indicators (KPIs)

The teaching notes for this module show how to implement Calculated Measures and KPIs.

- Identify further situations where Calculated Measures and KPIs would be useful for data analysis and implement your own examples. In your report illustrate and explain your work. Higher marks are available for examples that go beyond the module's basic teaching notes - cite the literature sources for all techniques you use that go beyond the course notes.

- Integration of data from different sources including data conformation

Nearly every Data Warehouse for analytic systems draws data from multiple sources. For example, sales could be made in a number of different countries and reported in slightly different forms (eg different currencies, or using different codes/names for the same items).

Assume the "SalesStarDemo" company is receiving its sales data from two franchised outlets, each of which report its data in formats that are not encoded in the same way as the provided Facts table. For example, for some fields the two sources don't even the same labels for the same things (they might sell the same product, but under different branded names).

- Create two tables, each representing a day's sales for one of the franchises (the number of records does not need to be large). Implement and explain processes to generate a unified NewFacts table that is appropriate for then uploading into the Production star.

- Explain any scripts and Staging area tables (eg lookup tables) you create. Wherever you use techniques you read about in the literature, cites sources.

- Cubes with multiple fact tables

The above example discusses a situation where multiple sources are effectively all about the same things (sales by our company), but coded inconsistently. During the Extract-Transform-Load processes, these sources are integrated and recoded into a single NewFacts table, which can then be appended to the Simple Star held within SalesStarDemo.

In other situations the multiple sources may be about related but different things. In this case it is often appropriate to generate Complex Stars with multiple Facts tables, and new Dimension tables.

Assume that our case study company is able (legally! perhaps though a market research company) to get hold of a summary of competitors' sales data on a monthly aggregated basis (i.e. once a moth we receive a report listing totals of HOW MUCH each competitor has sold in that month, but not to which customer or exactly what date). You may need to make other assumptions about the contents or level of detail in the summary report.

- Show and explain the following: Change the provided Data Warehouse to implement the Competitor data as a separate Facts table. Populate the new Facts table with suitable data. Rebuild the Cube such that it now has two Facts tables. Demonstrate the use of the Cube. If you make other design decisions, explain these. Cite any literature sources of help.

- Other (Eg Visualisation, Slowly Changing Dimensions, Use of Tabular facilities, your own choice)

- The module teaching notes make brief mention of quite a number of other techniques not listed in the "dot" titles above. Learn about one of these from literature (eg online tutorials; give references) and apply the approach to the "SalesDemo" dataset.

Marks will be awarded according to the extent of independent learning you demonstrate.

The total writing for Part A should be around 1200 words. Words beyond 1400 will not be read.

Part B

Using literature sources, critically evaluate your implementation, in particular addressing the question "Except for scale (number of rows of data), what is different about your implementation compared to what you would expect for a full industrial DW implementation?"

To answer Part B well you will have to establish what methods of implementation there are for full commercial or production analytical data warehousing.

You can get this information by reading books, journals and company white papers. You can use video lectures, tutorials, people's online blogs, company adverts too - but do not rely only on non-peer reviewed sources.

Many items can be accessed online books through the Library "Gateway" facility - see menu bars of Blackboard.

Almost certainly you will discuss what several authors say, highlighting the similarity or differences between their answers (try to analyse why they differ - eg what perspective are the authors taking? when was their document was written? does it have a bias towards a particular application/usage sector, etc), and you will use this comparison to review some pieces of your implementation, to explain to what extent your work is representative of what DW industry or researchers say is the topic of "data warehousing".

It is recommended that you pick a small number of topics (eg three, or four) and discuss these in detail rather than take a large number of topics and only discuss each in trivial depth.

Examples of topics you might discuss are:-

- What methodologies are used to structure data warehousing projects ("Inmon vs Kimball" is a good search starting point). What does the DW industry use? The approach you have used is closest to which?

- Industry-scale Data Warehouses probably use many tools to help automate routine processes. What are the main tools? Which are the processes most often covered by tools? How far do your scripts/processes illustrate the PRINCIPLES covered by the tools?

- Why does real data need so much data cleansing? What is industries' practice about data cleansing? In what ways does (or could) your assignment solution simulate what industry does?

- Results of Analytical data investigation are often presented visually (eg via graphs or displays). Why? You will have used a particular tool for your implementation (probably Excel). How representative of analytic visualisation tools is your assignment?

- Managing metadata is important. What is metadata? How does metadata help? What examples of metadata are there in your implementation?

Notice that we recommend that you discuss THREE areas, yet we have already listed FIVE topic examples. This is to illustrate that there is a wide range to choose. You can choose other topic areas yourself.

The total writing for Part B should be around 1200 words. Words beyond 1400 will not be read.

Citations and References

You MUST use APA-style for citations and referencing. This is SHU requirement (not just a module requirement).

DBMS, Programming

  • Category:- DBMS
  • Reference No.:- M92278052
  • Price:- $60

Guranteed 36 Hours Delivery, In Price:- $60

Have any Question?


Related Questions in DBMS

Data mining assignment -in this assignment you are asked to

Data Mining Assignment - In this assignment you are asked to explore the use of neural networks for classification and numeric prediction. You are also asked to carry out a data mining investigation on a real-world data ...

Sql query assignment -for this assignment you are to write

SQL Query Assignment - For this assignment you are to write your answers in a word document. This assignment is in three parts: Part A (reporting queries), Part B (query performance), Part C (query design). For this assi ...

The groceries datasetimagine 10000 receipts sitting on your

The groceries Dataset Imagine 10000 receipts sitting on your table. Each receipt represents a transaction with items that were purchased. The receipt is a representation of stuff that went into a customer's basket. That ...

You are in a real estate business renting apartments to

You are in a real estate business renting apartments to customers. Your job is to define an appropriate schema using SQL DDL in MySQL. The relations are Property(Id, Address, NumberOfUnits), Unit(ApartmentNumber, Propert ...

Objectivethe objective of this lab is to be familiar with a

OBJECTIVE: The objective of this lab is to be familiar with a process in big data modeling. You're required to produce three big data models using the MS PowerPoint software. This tool is available on UMUC Virtual Deskto ...

The relation memberstudentid organizationid roleid stores

The relation Member(StudentId, OrganizationId, RoleId) stores the membership information of student joining organization. For example, ('S1', 'O2', 'R3') indicates that student with Id 'S1' joined the organization with i ...

Relational database exerciseyou have been assigned to a new

Relational Database Exercise: You have been assigned to a new development team. A client is requesting a relational database system to manage their present store with the anticipation of adding more stores in the future. ...

Relational database design a given the following business

Relational Database Design A) Given the following business rules, identify entity types, attributes (at least two attributes for each entity, including the primary key) and relationships, and then draw an Entity-Relation ...

We can represent a data set as a collection of object nodes

We can represent a data set as a collection of object nodes and a collection of attribute nodes, where there is a link between each object and each attribute, and where the weight of that link is the value of the object ...

Data model development and implementationpurpose of the

Data model development and implementation Purpose of the assessment (with ULO Mapping) The purpose of this assignment is to develop data models and map Database System into a standard development environment to gain unde ...

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

Why might a bank avoid the use of interest rate swaps even

Why might a bank avoid the use of interest rate swaps, even when the institution is exposed to significant interest rate

Describe the difference between zero coupon bonds and

Describe the difference between zero coupon bonds and coupon bonds. Under what conditions will a coupon bond sell at a p

Compute the present value of an annuity of 880 per year

Compute the present value of an annuity of $ 880 per year for 16 years, given a discount rate of 6 percent per annum. As

Compute the present value of an 1150 payment made in ten

Compute the present value of an $1,150 payment made in ten years when the discount rate is 12 percent. (Do not round int

Compute the present value of an annuity of 699 per year

Compute the present value of an annuity of $ 699 per year for 19 years, given a discount rate of 6 percent per annum. As