Ask Question, Ask an Expert

+61-413 786 465

info@mywordsolution.com

Ask Computer Engineering Expert

As different authors tend to use different vocabularies and to use common words with differing frequencies. Given an essay or other text, it is interesting to find what distinct words are used and how many times each is used.

I'm trying to produce a driver program and the information-retrieval package using ordinary binary search trees.

I really want to complete this program using the outline I'm presenting you here:

1) Create the data structure (binary search tree).

2) Ask the user for the name of a text file and open it to read.

3) Read the file, split it apart into individual words, and insert the words into the data structure. With each word will be kept a frequency count (how many times the word appears in the input), and when duplicate words are encountered, the frequency count will be increased. The same word will not be inserted twice in the tree.

4) Print the number of comparisons done and the CPU time used in part 3. (By CPU time, I mean the amount of time for which the CPU was used for processing the comparisons (reading the file, splitting & inserting words)).

5) That I can, If I wish, print out all the words in the data structure, in alphabetical order, with their frequency counts.

6) Put everything in parts 2-5 into a do . . . while loop that will run as many times as the user wishes. Thus the user can build the data structure with more than one file if desired. By reading the same file twice, the user can compare time for retrieval with the time for the original insertion. (Let me clarify here, because parts 2 to 5 will be in a "do while" loop, then I can compare time for retrieval with the time for the original insertion).

Some additional specifications I was trying to implement to the program:

* I want the input to the driver will to be a file. The program will be executed with several different files; the name of the file to be used should be requested from the user while the program is running.

* A word is defined as a sequence of letters, together with apostrophes (') and hyphens (-), provided that the apostrophe or hyphen is both immediately preceded and followed by a letter. Uppercase and lowercase letters should be regarded as the same (by translating all letters into either uppercase or lowercase, as you prefer). A word is to be truncated to its first 20 characters (that is, only 20 characters are to be stored in the data structure) but words longer than 20 characters may appear in the text. Non-alphabetic characters (such as digits, blanks, punctuation marks, control characters) may appear in the text file. The appearance of any of these terminates a word, and the next word begins only when a letter appears.

* The program should not allow to be changed at all when I change implementation of data structures later.

Final specifications for the functions to I want implemented first with binary
search trees:
=========================================================
void update(const String &word,
Search_tree< Record > &structure, int &num_comps);

postcondition: If word was not already present in structure, then word has been inserted into structure and its frequency count is 1. If word was already present in structure, then its frequency count has been increased by 1. The variable parameter num_comps is set to the number of comparisons of words done.
=========================================================

void print(const Search_tree< Record > &structure);

postcondition: All words in structure are printed at the terminal in alphabetical order together with their frequency counts

Attachment:- Binary Search Tree.zip

Computer Engineering, Engineering

  • Category:- Computer Engineering
  • Reference No.:- M91614950
  • Price:- $20

Priced at Now at $20, Verified Solution

Have any Question?


Related Questions in Computer Engineering

A recent study reported that the prevalence of

A recent study reported that the prevalence of hyperlipidemia is 30% in children 2 to 6 years of age. If 12 children are analyzed what is the probability that at least 3 are hyperlipidemic?

How does the learning environment effect the success of

How does the learning environment effect the success of students? Provide examples.

Question suppose a process in host c has a udp socket with

Question : Suppose a process in Host C has a UDP socket with port number 6789. Suppose both Host A and Host B each send a UDP segment to Host C with destination port number 6789. Will both of these segments be directed t ...

Question superfast software inc was founded last year by

Question : Superfast Software Inc. was founded last year by three young programmers. They all dreamed their company would become a really big one and would distribute a large number of software products all over the worl ...

Suppose in your company you formulate a python script that

Suppose in your company you formulate a Python script that inserts, updates, and deletes data in tables in a MySQL database. You post your Python script on a shared drive for other staff members to use. What are some the ...

Elmers utility function isnbspuxnbspy minxnbspy2 if the

Elmer's utility function is  U ( x ,  y ) = min{ x ,  y 2 }. If the price of  x  is $10 and the price of  y  is $15 and if Elmer chooses to consume 4 units of  y , what must his income be? a. $220 b. $100 c. $320 d. Ther ...

Question suppose you are designing a database for a library

Question : Suppose you are designing a database for a library, the library system contains information about people who borrow. Each person is uniquely identified. People can search and borrow books. A book has book ID. ...

Question a show the results of inserting 10 12 1 14 6 5 8

Question : a) Show the results of inserting 10, 12, 1, 14, 6, 5, 8, 15, 3, 9, 7, 4, 11, 13, and 2, one at a into an initially empty binary heap. Show the tree at each stage. b) Show the result of performing three DeleteM ...

Taylor found that 8 of the recipients of loans form a

Taylor found that 8% of the recipients of loans form a particular mortgage lender default within 3 years. If he takes a random sample of 736 customers who received loans 3 years ago, what is the average number of custome ...

You are on a system in which the finger program has been

You are on a system in which the finger program has been disabled and you want a quicky finger type program and you decide that greping/etc/passwd would be sufficient. However the system that you are on uses nis+ and so ...

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

Why might a bank avoid the use of interest rate swaps even

Why might a bank avoid the use of interest rate swaps, even when the institution is exposed to significant interest rate

Describe the difference between zero coupon bonds and

Describe the difference between zero coupon bonds and coupon bonds. Under what conditions will a coupon bond sell at a p

Compute the present value of an annuity of 880 per year

Compute the present value of an annuity of $ 880 per year for 16 years, given a discount rate of 6 percent per annum. As

Compute the present value of an 1150 payment made in ten

Compute the present value of an $1,150 payment made in ten years when the discount rate is 12 percent. (Do not round int

Compute the present value of an annuity of 699 per year

Compute the present value of an annuity of $ 699 per year for 19 years, given a discount rate of 6 percent per annum. As