Ask Question, Ask an Expert

+61-413 786 465

info@mywordsolution.com

Ask Programming Language Expert

1 Introduction

In this assignment, you will carry out a number of exercises to investigate the creation and searching of heap and sorted files.

General Requirements

This section contains information about the general requirements that your assignment must meet. Please read all requirements carefully before you start.

1. You must implement your programs in C. Your programs must be well written, using good coding style and including appropriate use of comments.

2. Your programs may be developed on any machine, but must compile and run on yallara.

3. Any code you submit must be able to be built using a single Makefile with the command:

> make all

Tasks

a) Writing a heap file

prepare a program to create a heap file that holds the records currently in the file /scratch/ DatabaseSystems/DATA/data 2012 on yallara. The source records are variable-length. However, the heap file should hold fixed-length records. Create the new records according to the schema given in Table 1. All attributes with Unsigned Int type must be stored in binary, e.g. if the value of ID is equal to 70, it must be stored as 70 (in decimal) or 46 (in hexadecimal; in C: 0x46). It must not be stored as the string “70”, occupying two bytes. Your heap file is therefore a binary file. For simplicity, the heap file does not need a header (containing things like the number of records in the file or a free space list). The file should be packed, i.e. there is no gap between 2 records.

Table : Relation schema.

Attribute name          Data type           Size (bytes)
NAME                        Char                      13
RACE                         Unsigned Int             2
CLASS                       Unsigned Int             2
ID                             Unsigned Int             4
GUILD                          Char                     30
Total size:                                               51

Note that you will need to ensure that the size of each record matches the size shown in Table . To ensure that records are correctly packed in memory, you may need to use the following directive in your code before you define your structure:

#pragma pack(1)

The executable name of this program must be wHeap and should be executed using the command:
> ./wHeap data_2012 heap pagesize

where data 2012 is the input file, heap is an output file to which your converted data is written, and pagesize is an integer specifying how many records fit into a “page” of your file.

Your program should prepare out one “page” of the file at a time (for ex, with a pagesize of 100, you would prepare out 100 records to disk at a time). Your wHeap program must not output anything to stdout.

b) Search on a heap file

Look at the file /scratch/DatabaseSystems/DATA/search 2012. This is a text file containing search key values; each entry is a particular ID (in the schema given above). You are to simulate searching over a heap file, with different assumptions for the size of file pages.

prepare a program to perform equality search operations on the heap file produced by your wHeap program in Section a. The executable name of this program must be sHeap and it must be able to be executed using the command:

> ./sHeap search_2012 heap pagesize

where search 2012 is the name of the file containing the keys to be searched for; heap is the output file of our wHeap program; and pagesize is an integer value that specifies the size of the disk page that you are simulating.

Your program should read in the file, one “page” at a time. For ex, if the pagesize parameter is 100, your program should read in the first 100 records from disk. These can then be scanned, in-memory, for a match. If a match is found, print the matching record to stdout.
You should assume that ID is a primary key. If no match is found, read in the next pagesize records of the file. The process should continue until either a matching record is found, or there are no more records in the file to process.

If a match is found, the program must print the matching record to stdout. If no match is found, a suitable message should be printed. In addition, the program must always output the total time taken to do all the search operations in milliseconds to stdout. For ex, if the time taken to do the reading is 123.45 ms, the output would be:

Time: 123.45 ms

c) Writing a sorted file

prepare a program to create a sorted file that stores the records currently in the file /scratch/ DatabaseSystems/DATA/data 2012 on yallara. You may modify your code from Section

Records should use the same fixed-length schema given previously, and should again be written in binary. When inserting the records into your new file, they should be sorted on an appropriate attribute. You will need to choose a sensible sorting algorithm, appropriate to the data that you are dealing with. You should implement this sorting algorithm yourself.

The executable name of this program must be wSort and it must be able to be executed using the command:

> ./wSort data_2012 sorted pagesize

where data 2012 is the input file; sorted is an output file to which your converted data is written; and pagesize is an integer specifying the size of a page of the file (that is, the number of records that can be stored per page).

Like wHeap, your wSort program must not output anything to stdout.

d) Search on a sorted file

prepare a program to simulate searching over a sorted file, with different assumptions for the size of file pages. prepare a program to perform equality search operations on the sorted file produced by your wSort program in Section c. The executable name of this program must be sSort and it must be able to be executed using the command:

> ./sSort search_2012 sorted pagesize

where search 2012 is the name of the file containing the ID keys to be searched for; sorted is the output file of our wSort program; and pagesize is an integer value that specifies the size of the disk page that you are simulating.

Your program must take advantage of the assumption that sorted is a file whose structure has been created by sorting on the ID key. Your program should read in required parts of the file, one “page” at a time. For ex, if the pagesize parameter is 100, your program should fetch 100 records in a single read from disk. These can then be scanned, in-memory, for a match.

If a match is found, the program must print the matching record to stdout. If no match is found, a suitable message should be printed. In addition, the program must output the total time taken to do all the search operations in milliseconds to stdout. For ex, if the time taken to do the reading is 123.45 ms, the output would be:

Time: 123.45 ms

Experiments and Analysis

In this section, you will be asked to carry out a number of experiments and to analyse your results. Create a file called report.txt. Use this file to record your answers to the following problems:

a) Searching with a heap file

• Put a heading “Equality search (heap file)” in your report.

• Run your sHeap program with pagesize settings of: 100; 1,000; 10,000. For each of these pagesize settings, run your program 10 times. Create a table in your report, and record the timing results, including the date and time of each run.

• find out the average and standard deviation of the running times for each pagesize, and record them in your report.

b)Searching with a sorted file

• Put a heading “Equality search (sorted file)” in your report.

• Run your sSort program with pagesize settings of: 100; 1,000; 10,000. For each of these pagesize settings, run your program 10 times. Record the timing results in a table in your report, including the date and time of each run.

• find out the average and standard deviation of the running times, for each pagesize, and record them in your report.

c) Comparison of approaches

• Put a heading “Comparison of approaches” in your report.

• With reference to your experimental results above, discuss the advantages and disadvantages of the heap and sorted file organisations. Do the trends change for different page sizes? Are the results what you would have expected to see based on your theoretical understanding of these file organisations? Why or why not? Limit your discussion to half a page.

d) Theory

• Answer the following problems.

1. Suppose that instead of equality searches, you were carrying out range searches. Would you expect your results to change? Which of the file organisations would you prefer? As part of your discussion, you should demonstrate your understanding of the properties of the file organisations. Limit your discussion to half a page.

2. Now, suppose that instead of equality searches, you were inserting new records into the file. Which of the file organisations would you prefer? As part of your discussion, you should demonstrate your understanding of the properties of the file organisations.

Programming Language, Programming

  • Category:- Programming Language
  • Reference No.:- M91943

Have any Question?


Related Questions in Programming Language

Task working with arraysoverviewin this task you will

Task: Working with Arrays Overview In this task you will create a simple program which will create and work with an array of strings. This array will then be populated with values, printed out to the console, and then, w ...

Question 1 what is a computer program what is structured

Question: 1. What is a Computer program? What is structured programming? 2. What is modular programming? Why we use it? 3. Please evaluate Sin (x) by infinite series. Then write an algorithm to implement it with up to th ...

Task silly name testeroverviewcontrol flow allows us to

Task: Silly Name Tester Overview Control flow allows us to alter the order in which our programs execute. Building on our knowledge of variables, we can now use control flow to create programs that perform more than just ...

Assignment - proposal literature review research method1

Assignment - Proposal, Literature Review, Research Method 1. Abstract - Summary of the knowledge gap: problems of the existing research - Aim of the research, summary of what this project is to achieve - Summary of the a ...

Question - create a microsoft word macro using vba visual

Question - Create a Microsoft Word macro using VBA (Visual Basic for Applications). Name the macro "highlight." The macro should highlight every third line of text in a document. (Imagine creating highlighting that will ...

Php amp session managment assignment -this assignment looks

PHP & SESSION MANAGMENT ASSIGNMENT - This assignment looks at using PHP for creating cookies and session management. Class Exercise - Web Project: Member Registration/Login This exercise will cover adding data connectivi ...

Assignment - haskell program for regular expression

Assignment - Haskell Program for Regular Expression Matching Your assignment is to modify the slowgrep.hs Haskell program presented in class and the online notes, according to the instructions below. You may carry out th ...

Task arrays and structsoverviewin this task you will

Task: Arrays and Structs Overview In this task you will continue to work on the knight database to help Camelot keep track of all of their knights. We can now add a kingdom struct to help work with and manage all of the ...

Structs and enumsoverviewin this task you will create a

Structs and Enums Overview In this task you will create a knight database to help Camelot keep track of all of their knights. Instructions Lets get started. 1. What the topic 5 videos, these will guide you through buildi ...

Question 1 what is hadoop explaining hadoop 2 what is

Question: 1. What is Hadoop (Explaining Hadoop) ? 2. What is HDFS? 3. What is YARN (Yet Another Resource Negotiator)? The response must be typed, single spaced, must be in times new roman font (size 12) and must follow t ...

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

Why might a bank avoid the use of interest rate swaps even

Why might a bank avoid the use of interest rate swaps, even when the institution is exposed to significant interest rate

Describe the difference between zero coupon bonds and

Describe the difference between zero coupon bonds and coupon bonds. Under what conditions will a coupon bond sell at a p

Compute the present value of an annuity of 880 per year

Compute the present value of an annuity of $ 880 per year for 16 years, given a discount rate of 6 percent per annum. As

Compute the present value of an 1150 payment made in ten

Compute the present value of an $1,150 payment made in ten years when the discount rate is 12 percent. (Do not round int

Compute the present value of an annuity of 699 per year

Compute the present value of an annuity of $ 699 per year for 19 years, given a discount rate of 6 percent per annum. As