Design and implement an AVL tree algorithm that searches a collection of documents. You will be provided with a set of 50 documents and a set of sample queries. First, you will process the documents and store their content (i.e., words or tokens) in the data structures you have selected; in information retrieval, this phase is called indexing. Next, for every input query, you will process the query and search for its keywords in the documents; this phase is called retrieval. For each query, you must display the documents that satisfy it. Queries may contain the simple Boolean operators AND and OR, which behave like the well-known logical operators of the same names. For instance, the query "Keyword1 AND Keyword2" should retrieve all documents that contain both keywords, whereas "Keyword1 OR Keyword2" should retrieve all documents that contain at least one of the two keywords.
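As an illustration of the indexing phase, here is a minimal sketch of an AVL tree keyed by word, where each node stores the set of documents that contain that word. The names Node, insert, and search are hypothetical, and storing a Python set of document ids per node is an assumption; the assignment does not prescribe a particular layout.

```python
class Node:
    """One AVL node: a word (the key) and the ids of documents containing it."""
    def __init__(self, word, doc_id):
        self.word = word
        self.docs = {doc_id}   # assumption: postings kept as a set of ids
        self.left = None
        self.right = None
        self.height = 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def insert(node, word, doc_id):
    """Insert (word, doc_id); return the root of the rebalanced subtree."""
    if node is None:
        return Node(word, doc_id)
    if word == node.word:
        node.docs.add(doc_id)          # word already indexed: record the document
        return node
    if word < node.word:
        node.left = insert(node.left, word, doc_id)
    else:
        node.right = insert(node.right, word, doc_id)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:                    # left-heavy
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)     # left-right case
        return rotate_right(node)
    if balance < -1:                   # right-heavy
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node


def search(node, word):
    """Return the set of document ids for word (empty set if absent)."""
    while node is not None:
        if word == node.word:
            return node.docs
        node = node.left if word < node.word else node.right
    return set()


# Usage: index two tiny documents (punctuation assumed already removed).
root = None
for doc_id, text in [("Doc1", "data structures"), ("Doc2", "statistical data")]:
    for word in text.split():
        root = insert(root, word, doc_id)
print(sorted(search(root, "data")))  # ['Doc1', 'Doc2']
```

Because the tree is rebalanced after every insertion, a keyword lookup takes a number of comparisons proportional to the logarithm of the number of distinct words.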
Example
Consider the following sample documents.
Doc1: I like the class on data structures and algorithms.
Doc2: I hate the class on data structures and algorithms.
Doc3: Interesting statistical data may result from this survey.
Here are the answers to some queries:
Query 1: data
Doc1, Doc2, Doc3
Query 2: data AND structures
Doc1, Doc2
Query 3: like OR survey
Doc1, Doc3
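The three sample queries can be reproduced with a short sketch of the retrieval phase. To stay self-contained it uses a plain dictionary as a stand-in for the AVL tree index. The function names, the lowercasing of words, the stripping of trailing punctuation, and the left-to-right evaluation of operators are all assumptions, since the assignment does not specify them.

```python
DOCS = {
    "Doc1": "I like the class on data structures and algorithms.",
    "Doc2": "I hate the class on data structures and algorithms.",
    "Doc3": "Interesting statistical data may result from this survey.",
}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}  # stand-in for the AVL tree, for illustration only
    for doc_id, text in docs.items():
        for raw in text.split():
            word = raw.strip(".,;:!?").lower()  # assumption: normalise words
            if word:
                index.setdefault(word, set()).add(doc_id)
    return index


def evaluate(query, index):
    """Evaluate 'k1 OP k2 OP k3 ...' left to right, with OP in {AND, OR}."""
    tokens = query.split()
    if not tokens:
        return []
    # Copy the first postings set so the index itself is never modified.
    result = set(index.get(tokens[0].lower(), set()))
    for op, keyword in zip(tokens[1::2], tokens[2::2]):
        postings = index.get(keyword.lower(), set())
        if op == "AND":
            result &= postings   # documents containing both
        elif op == "OR":
            result |= postings   # documents containing at least one
        else:
            raise ValueError(f"unknown operator: {op}")
    return sorted(result)


index = build_index(DOCS)
print(evaluate("data", index))                 # ['Doc1', 'Doc2', 'Doc3']
print(evaluate("data AND structures", index))  # ['Doc1', 'Doc2']
print(evaluate("like OR survey", index))       # ['Doc1', 'Doc3']
```

AND corresponds to set intersection and OR to set union, so the same evaluate function would work unchanged if the dictionary lookup were replaced by a search in the tree.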
Hints
Take a look first at the format of the documents.
You will have to parse the input. You may ignore all lines starting with "<"; these are SGML tags that are
useful for certain tasks, but you will probably not need them in this project. The punctuation is
already separated from the words, so you do not have to worry about that. Read one word at a
time and add it to your data structure.
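The parsing step described above might look like the following sketch. The function name read_words and the tag names in the sample input are hypothetical, because the actual document format is not shown here. Lowercasing and ignoring leading whitespace before the "<" check are further assumptions.

```python
def read_words(lines):
    """Yield one word at a time, skipping blank lines and SGML tag lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("<"):
            continue                 # tag line or empty line: ignore
        for word in line.split():    # punctuation is already space-separated
            yield word.lower()       # assumption: case-insensitive matching


# Hypothetical input; the real files may use different tags.
sample = [
    "<DOC>",
    "<DOCNO> 1 </DOCNO>",
    "I like the class on data structures .",
    "</DOC>",
]
print(list(read_words(sample)))
# ['i', 'like', 'the', 'class', 'on', 'data', 'structures', '.']
```

Punctuation marks come through as separate tokens in this sketch; they can be filtered out before insertion if they should not be searchable. For a real file, passing the open file object as lines works the same way, since iterating over a file yields its lines.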