
Strings, Structs & Files (CSC100)

Name your C++ source code file: data.cpp

Overview of Problem: Read a text file and build an array of structs, one element per unique word found in the file, each holding the word and the number of times it appears in the file regardless of case. Use this array to calculate and display the statistics described below.

A text file contains English text, such as essay.txt, which is available as example input for this assignment. Your program must read the contents of such a text file, store the unique words, and count the occurrences of each unique word. When the file has been completely read and the array of unique word structs has been filled, print the words in sorted (alphabetical) order, with the number of occurrences of each word, to an output text file. After the list of words has been written, the following statistics must also be written to the output file:

  • Total number of words read
  • Total number of unique words read
  • Average length of a word (as a floating point value)
  • Average occurrence of a word (as a floating point value)
  • Most commonly occurring word(s)
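The statistics above can all be derived from the finished array of word/count structs. The following is a minimal sketch, not a required design: the struct name WordCount and the function names are hypothetical, and it assumes that "average length of a word" means the average over every word read (each unique word weighted by its count). If the average is meant over unique words only, the weighting would be dropped.

```cpp
#include <cstring>

const int MAX_WORD_LEN = 20;   // maximum word length stated in the assignment

// Hypothetical struct name: one element per unique word.
struct WordCount
{
    char word[MAX_WORD_LEN + 1];   // +1 for the null character
    int count;                     // occurrences of this word
};

// Total number of words read: the sum of all counts.
int totalWords(const WordCount list[], int numUnique)
{
    int total = 0;
    for (int i = 0; i < numUnique; i++)
    {
        total += list[i].count;
    }
    return total;
}

// Average word length over all words read (assumed interpretation).
double averageLength(const WordCount list[], int numUnique)
{
    int total = totalWords(list, numUnique);
    long letters = 0;
    for (int i = 0; i < numUnique; i++)
    {
        letters += static_cast<long>(std::strlen(list[i].word)) * list[i].count;
    }
    return (total == 0) ? 0.0 : static_cast<double>(letters) / total;
}

// Average occurrence: total words divided by unique words.
double averageOccurrence(const WordCount list[], int numUnique)
{
    int total = totalWords(list, numUnique);
    return (numUnique == 0) ? 0.0 : static_cast<double>(total) / numUnique;
}

// Highest count in the array; every word with this count is a
// most commonly occurring word.
int highestCount(const WordCount list[], int numUnique)
{
    int highest = 0;
    for (int i = 0; i < numUnique; i++)
    {
        if (list[i].count > highest)
        {
            highest = list[i].count;
        }
    }
    return highest;
}
```

The number of unique words is simply the number of elements in use. To list the most common word(s), loop over the array once more and print each word whose count equals the value returned by highestCount.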

You can see an example input file and its corresponding output file produced by your code here.

NOTE: Throughout this assignment (and this course!), when the data type, string, is mentioned, you are expected to use a c-string, which is an array of char. Do not use the name, string, as a data type directly. To make a variable a string, declare it as an array of char.
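As a small illustration of what working with c-strings looks like, the sketch below copies and compares words held in arrays of char. The helper names copyWord and sameWord are hypothetical; the 20-character limit is the one given in the processing requirements.

```cpp
#include <cstring>

const int MAX_WORD_LEN = 20;   // word length limit stated in the assignment

// Copy src into dest, keeping at most MAX_WORD_LEN characters.
// dest must be declared with room for MAX_WORD_LEN + 1 chars.
void copyWord(char dest[], const char src[])
{
    std::strncpy(dest, src, MAX_WORD_LEN);
    dest[MAX_WORD_LEN] = '\0';   // strncpy does not terminate when src is too long
}

// Return true when two c-strings hold exactly the same characters.
bool sameWord(const char a[], const char b[])
{
    return std::strcmp(a, b) == 0;   // strcmp returns 0 for equal strings
}
```

A variable for one word would then be declared as `char word[MAX_WORD_LEN + 1];` rather than with the string type. Arrays of char cannot be assigned with = or compared with ==, which is why the functions from <cstring> are needed.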

Processing Requirements:

  • You will need to prompt the user for the name of the input text file. From this name you are to create the name of the output text file. The output text file should have the same name, but use the extension: .out. So if the input file is named: test.txt, then the output file should be named test.out. Do not prompt for the output file, create it using the input file name. The names of the files are strings, i.e., arrays of char.
  • Close the input file when you have completed reading and close the output file when you have completed writing.
  • In addition to the title at the beginning of the program and the prompts to the user for the name of the input file, print to the screen identification of the different steps happening during the processing. As there is no other output to the screen, this is helpful in identifying the progress of the program.
  • Use an array of structs to hold the words and word counts. Define a struct to hold a string (array of char) and a count (integer). Your main function will then declare an array of these structs. Assume that a word has a maximum length of 20 characters, so the string (array of char) size must be 21 to allow for the null character.
  • All words stored in the array should be stored in all lowercase. So, if a word appears capitalized in one place and in lowercase in another place, these two occurrences count as two occurrences of the same word, not two different words. See the handling of the words "This" and "this" in the example output.
  • Declare the array of structs in the main and pass it as a parameter to the other functions. Declaring this array as a global variable is not acceptable. No variables should be global, except for named constants.
  • Break your code into meaningful functions that are short and perform only one well-defined, easily understood task.
  • You must use the linear search algorithm to determine if a word is in the array. Remember that the array is an array of structs and that the key is a string (array of char), so the string comparison function, strcmp, must be used. The search task should be a separate function that accepts the array of structs, the number of items currently in the array, and the string value (a word) for which to search. The search function should return the position (index, subscript, an integer) where the value (word) is found in the array, or -1 if it is not found. The search function must have only one return statement.
  • Use the selection sort algorithm (given in the array module) to sort the array of structs. To use the given algorithm, you must change it from handling arrays of integers to handling an array of structs. The items being compared will be strings (arrays of char), which will require the use of the strcmp function.
  • Use good programming style. You will, as usual, be graded on: documentation of the program and functions (each function should have its own comment prolog), names of variables, indentation of statements, correct use of loops, correct use of structs and functions and everything else that you learned about this semester!
  • Functions are limited to a length of about 25 statements (not including declarations). Use good modularization criteria to divide longer functions into sub-functions.
  • The example programs in this module demonstrate both reading from and writing to a file. A word is defined as one or more characters terminated by one or more spaces, the end of a line, or the end of a file. Your program should also remove any punctuation from the beginning and end of words. If punctuation is found inside a word (not at beginning or end, such as a hyphen), your program should consider that symbol as part of that word. All words must be stored in lower case since the case of words in the file must be ignored.
  • You can assume a maximum of 500 UNIQUE words. If there are more than 500 UNIQUE words, print an error message (ONCE) and then continue reading the file to count words that have already been stored in the array.
  • Here is an algorithm for the function to read the input file, count words, and store them into the array of structs:

             while not end of file
                read a word from file
                search for this word in the array of structs
                if the word is in the array
                    increment the count in the struct at the position where the word was found
                else
                    add the word to the end of the array and set the count to 1
             end while

  • The text file, essay.txt, may be used to test your program. However, you will want to begin your testing with very small text files containing only one or two words at first. Increase the length of the text file used for testing as you become more confident that your solution is working.
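The struct, the linear search, the counting step of the algorithm above, and the sort can be sketched as follows. This is only one possible arrangement: the names WordCount, findWord, countWord and sortWords are hypothetical, the sort is a generic selection sort and not necessarily the version given in the array module, and countWord assumes the word has already been stripped of punctuation and converted to lowercase.

```cpp
#include <cstring>

const int MAX_WORD_LEN = 20;   // maximum word length stated in the assignment
const int MAX_UNIQUE = 500;    // maximum number of unique words

struct WordCount
{
    char word[MAX_WORD_LEN + 1];   // +1 for the null character
    int count;
};

// Linear search with a single return statement.
// Returns the index of target, or -1 if it is not in the array.
int findWord(const WordCount list[], int numWords, const char target[])
{
    int position = -1;
    for (int i = 0; i < numWords && position == -1; i++)
    {
        if (std::strcmp(list[i].word, target) == 0)
        {
            position = i;
        }
    }
    return position;
}

// One pass of the loop body: increment the count of a known word or
// append a new one. Returns false only when the word is new and the
// array is already full, so the caller can print the error once.
bool countWord(WordCount list[], int& numWords, const char word[])
{
    bool stored = true;
    int position = findWord(list, numWords, word);
    if (position != -1)
    {
        list[position].count++;
    }
    else if (numWords < MAX_UNIQUE)
    {
        std::strncpy(list[numWords].word, word, MAX_WORD_LEN);
        list[numWords].word[MAX_WORD_LEN] = '\0';
        list[numWords].count = 1;
        numWords++;
    }
    else
    {
        stored = false;
    }
    return stored;
}

// Selection sort on the word field, comparing with strcmp.
void sortWords(WordCount list[], int numWords)
{
    for (int start = 0; start < numWords - 1; start++)
    {
        int smallest = start;
        for (int i = start + 1; i < numWords; i++)
        {
            if (std::strcmp(list[i].word, list[smallest].word) < 0)
            {
                smallest = i;
            }
        }
        if (smallest != start)
        {
            WordCount temp = list[start];   // structs can be copied whole
            list[start] = list[smallest];
            list[smallest] = temp;
        }
    }
}
```

In this arrangement, main would declare `WordCount list[MAX_UNIQUE];` and an integer counter starting at 0, then pass both to a reading function that calls countWord for each word taken from the file. Because the array is passed as a parameter, nothing except the named constants needs to be global.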

Turn in a single source code file named data.cpp.
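Two of the smaller string-handling requirements, building the output file name and cleaning a word before it is stored, can be sketched with c-string functions as shown here. The function names makeOutputName and cleanWord are hypothetical, and makeOutputName assumes a plain file name such as test.txt with no directory part containing a dot.

```cpp
#include <cctype>
#include <cstring>

// Build the output file name from the input file name by replacing
// the extension with .out (or appending .out if there is none).
// outName must have room for strlen(inName) + 5 characters.
void makeOutputName(const char inName[], char outName[])
{
    std::strcpy(outName, inName);
    char* dot = std::strrchr(outName, '.');   // last '.' in the name, if any
    if (dot != nullptr)
    {
        *dot = '\0';                          // cut off the old extension
    }
    std::strcat(outName, ".out");
}

// Remove punctuation from the beginning and end of a word and convert
// the rest to lowercase, in place. Punctuation inside the word, such
// as a hyphen, is kept. The result may be an empty string.
void cleanWord(char word[])
{
    int first = 0;
    int last = static_cast<int>(std::strlen(word)) - 1;

    while (first <= last && std::ispunct(static_cast<unsigned char>(word[first])))
    {
        first++;
    }
    while (last >= first && std::ispunct(static_cast<unsigned char>(word[last])))
    {
        last--;
    }

    int length = 0;
    for (int i = first; i <= last; i++)
    {
        word[length] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(word[i])));
        length++;
    }
    word[length] = '\0';
}
```

A token made up only of punctuation becomes an empty string after cleanWord, so the reading function would probably want to skip empty results instead of counting them as words.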
