Ask Question, Ask an Expert

+1-415-315-9853

info@mywordsolution.com

Ask DBMS Expert


Home >> DBMS

problem 1: In the following, assume the latency/transfer-rate model of disk performance, where we estimate disk access times by allowing blocks that are consecutive on disk to be fetched with a single seek time and rotational latency cost (as shown in class). Also, we use the term RID (Record ID) to refer to an 8-byte "logical pointer" that can be used to locate a record (tuple) in a table.

You are given the following very simple schema for a credit card payment database. In this schema, people can make payments to merchants (stores) with their credit cards. All the payment records are stored in the CardCharge table with a unique identifier and a timestamp. (We are only concerned with the time of the charge, not the time the merchant receives the money from the credit card company or the time the customer pays his credit card bill, which both are much later.) The database stores for every person a name, city, state, and SSN. A customer may have several credit cards, for which we store the customer’s SSN, the time the credit card was issued,, and the expiration time. Each merchant has a name, city, state and a unique identifier. The details of the schema are shown below:

Person(ssn, pname, pcity, pstate)

Card(ccn, ssn, issuetime, expiration) // ssn references Person

Merchant(mid, mname, mcity, mstate)

CardCharge(chargeid, ccn, mid, ctime, camount) // ccn reference Card and mid references Merchant

Assume there are 50 million customers, 200 million cards, 1 million merchants and 10 billion charge records. Each tuple is of size 200 bytes, and each ID requires 16-bytes. Consider the following queries:

SELECT chargeid

FROM CardCharge CC, Card C, Person P

WHERE CC.ccn = C.ccn and C.ssn = P.ssn and P.pname = “Cathy Crowbar”

SELECT P.ssn

FROM CardCharge CC, Card C, Person P

WHERE CC.ccn = C.ccn and C.ssn = P.ssn and P.pcity = “Chicago”

SELECT P.ssn

FROM CardCharge CC, Card C, Merchant M, Person P

WHERE CC.mid = M.mid and CC.ccn = C.ccn and C.ssn = P.ssn and CC.camount > 1000 and M.mcity = “Elko” and P.pcity = “New York City”

SELECT C.ssn

FROM CardCharge CC, Card C, Merchant M, Person P

WHERE CC.ccn = C.ccn and CC.mid = M.mid and C.ssn = P.ssn and M.mcity = P.pcity

a) For each query, describe in one sentence what it does. (That is, what task does it perform?)

In the following, to describe how a query is executed, draw a query plan tree and state what algorithms should be used for the various selections and joins. Provide estimates of the running times, assuming these are dominated by disk accesses.

b) Assume that there are no indexes on any of the relations, and that all relations are unclustered (not sorted in any way). Describe how a database system would best execute all four queries in this case, given that 2GB of main memory are available for query processing, and assuming a hard disk with 10 ms for seek time plus rotational latency (i.e., a random access requires 10 ms to find the right position on disk) and a maximum transfer rate of 60 MB/s.

Assume that 2% of all customers live in Chicago and 5% live in New York City, that there are only 5 customers named “Cathy Crowbar”, that there are 200 Merchants Elko, and that 1% of all charges are for more than $1000. Also, if nothing is stated, assume independence (e.g.., customers in Chicago have on average the same number of cards and same spending patterns as the average customer, and if 1% of all people live in Cleveland and 20% of all charges were done in 2011, then 0.2% of all charges were made during 2011 by people living in Cleveland.)

c) Consider a sparse clustered B+-tree index on chargeid in the CardCharge table, and a dense unclustered B+-tree index on mname in the Merchant table, where mname has a (fixed) size of 16 bytes. For each index, what is the height and the size of the tree? How long does it take to fetch a record with a particular key value value using these indexes?

d) Suppose that for each query, you could create up to two index structures to make the query faster. What index structures would you create, and how would this change the evaluation plans and running times? (In other words, redo (b) for each query using your best choice of indexes for that query.)

problem 2:

a) Consider a hard disk with 6000 RPM and 3 single-sided platters. Each surface has 400,000 tracks and 2000 sectors per track. (For simplicity, we assume that the number of sectors per track does not vary between the outer and inner area of the disk.) Each sector has 1024 bytes. What is the capacity of the disk? What is the maximum rate at which data can be read from disk, assuming that we can only read data from one surface at a time? What is the average rotational latency?

b) Suppose we have another disk, different from the one in part (a), with average seek time 4 ms, average rotational latency 6 ms, and maximum transfer rate 60 MB/s. How long does it take to read a file of size 8 KB? How about a file of 80 KB? How about a file of 8 MB? Use both the block model (4KB per block) and the latency/transfer-rate model, and compare.

c) Suppose you have a file of size 81 GB that must be sorted, and you have only 1 GB of main memory to do the sort (plus unlimited disk space). Estimate the running time of the I/O-efficient merge sort algorithm from the class on this data, using the hard disk from part (b). Use the latency/transfer-rate model of disk performance, and ignore CPU performance. Assume that in the merge phase, all sorted runs from the initial phase are merged together in a single merge pass.

d) Suppose you use two (instead of one) merge phases in the scenario in (c). What would be the degree of the merges, and how would this change the running time of the sort?

DBMS, Programming

  • Category:- DBMS
  • Reference No.:- M96027

Have any Question? 


Related Questions in DBMS

Lab 2 modifying a database design in visiothis assignment

Lab 2: Modifying a Database Design in Visio This assignment contains two Sections: Visio Diagram and Design Summary. You must submit both sections as separate files in order to complete this assignment. Use "Appendix A: ...

Question 1use the tables below to answer the following

Question 1: Use the tables below to answer the following questions: customerid customerid firstname lastname city state 10101 John Gray Lynden Washington 10298 Leroy Brown Pinetop Arizona 10299 Elroy Keller Snoqualmie Wa ...

Create a small bank account database with one account table

Create a small Bank Account database with one Account table. The Account table Should have fields for account number, customer last and first names, and current balance. The type of database will be Access .Populate the ...

Database systems assignment - student enrolment database a

Database Systems Assignment - Student Enrolment Database A. Task Description 1. Overview Your task is to create and test a database in MySQL using PhpMyAdmin. You are provided with a scenario and supporting documents, de ...

Analytical assignment individual or group of max two

Analytical Assignment (Individual or Group of Max two students) What is Research article? Review articles provides summary of current state of the research on a particular topic. Ideally, the writer searches for everythi ...

Homeworkgiven the relations below answer the following

HOMEWORK Given the relations below, answer the following queries using SQL. Note that the tables contain sample records only and do not represent all the possible records that could be stored in these tables. MEMB ERS ME ...

Some people used to believe that data warehouses would

Some people used to believe that data warehouses would quickly replace databases for both online transaction processing (OLTP) and online analytical processing (OLAP). Of course, they were wrong. Why can data warehouses ...

Option simulation preparation paperby the end of module 2

Option : Simulation Preparation Paper By the end of Module 2, you will have received the URL, team name (also known as logon ID), and password for the Littlefield Simulation site from your instructor. This week, you must ...

Exercise 1 - html and Exercise 1 - HTML and XML ID Name Stars Facilities Address Distance

Exercise 1 - HTML and XML ID Name Stars Facilities Address Distance from center (KM) Available 1 Les Jardins du Marais 3 Internet 74 rue Amelot, Paris, 75011 2 True 2 Golden Tulip Little Palace 4 Internet Gym Parking Res ...

Suppose that you are the database developer for a local

Suppose that you are the database developer for a local college. The Chief Information Officer (CIO) has asked you to provide a summary of normalizing database tables that the IT staff will use in the upcoming training s ...

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

WalMart Identification of theory and critical discussion

Drawing on the prescribed text and/or relevant academic literature, produce a paper which discusses the nature of group

Section onea in an atwood machine suppose two objects of

SECTION ONE (a) In an Atwood Machine, suppose two objects of unequal mass are hung vertically over a frictionless

Part 1you work in hr for a company that operates a factory

Part 1: You work in HR for a company that operates a factory manufacturing fiberglass. There are several hundred empl

Details on advanced accounting paperthis paper is intended

DETAILS ON ADVANCED ACCOUNTING PAPER This paper is intended for students to apply the theoretical knowledge around ac

Create a provider database and related reports and queries

Create a provider database and related reports and queries to capture contact information for potential PC component pro