Database and Information Retrieval

Question 1: Suppose you have joined a search engine development team to design a search algorithm based on both the Vector model and the Boolean model. Your task is to collect unstructured documents on the topics below and apply an indexing technique to convert them into an inverted index.

Collect three documents (fewer than 30 words each) on three different topics. Suggested topics are listed below; you may also choose other topics you prefer.

  • Science
  • Computer Vision
  • Search Engine
  • Database
  • Security and Privacy

An example document:

"Google is the most widely used Web search engine in the World. It claims to be the World's most comprehensive search engine, indexing over 2.4 billion Web pages."

1. Creating the inverted index. Complete the following steps:

a. Find a stopword list on the Internet and remove all stopwords and punctuation from the three documents.

Then apply Porter's stemming algorithm to all documents. Plenty of online stemming applications that implement the Porter algorithm are available, and you may use one for this question. The output will be a set of stemmed terms.

b. Create a merged inverted list including the within-document frequencies for each term.

c. Use the index created in step (b) to create a dictionary and the related posting file.

d. Test the inverted index with some keywords selected from the documents, for example: google, web, search.
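Steps (a) to (d) can be sketched in Python roughly as follows. This is a minimal illustration, not a reference solution: the stopword list is a tiny hand-picked stand-in for a real published list, the second document is hypothetical, and stemming is left as a pluggable `stem` function that defaults to doing nothing (a real Porter stemmer, such as the one shipped with NLTK, would be passed in instead). All function names are made up for this example.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption); use a full published list in practice.
STOPWORDS = {"is", "the", "most", "in", "it", "to", "be", "over", "a", "an", "of", "and"}

def preprocess(text, stem=lambda term: term):
    """Lowercase, strip punctuation, drop stopwords, then stem each term.

    `stem` defaults to the identity function; pass a real Porter stemmer
    to obtain stemmed terms.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]

def build_index(docs, stem=lambda term: term):
    """Step (b): merged inverted list, {term: {doc_id: within-document frequency}}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(preprocess(text, stem)).items():
            index[term][doc_id] = freq
    return dict(index)

def split_dictionary_postings(index):
    """Step (c): split the index into a dictionary and a flat postings file.

    dictionary maps term -> (document frequency, offset into postings);
    postings is a list of (doc_id, within-document frequency) pairs.
    """
    dictionary, postings = {}, []
    for term in sorted(index):
        entries = sorted(index[term].items())
        dictionary[term] = (len(entries), len(postings))
        postings.extend(entries)
    return dictionary, postings

def lookup(term, dictionary, postings):
    """Step (d): fetch the postings for one already-preprocessed keyword."""
    if term not in dictionary:
        return []
    df, offset = dictionary[term]
    return postings[offset:offset + df]

docs = {
    1: "Google is the most widely used Web search engine in the World.",
    2: "A database stores data; a search engine indexes Web pages.",  # hypothetical document
}
index = build_index(docs)
dictionary, postings = split_dictionary_postings(index)
print(lookup("search", dictionary, postings))  # postings for the keyword "search"
print(lookup("google", dictionary, postings))  # postings for the keyword "google"
```

Storing an offset and a document frequency in the dictionary keeps the dictionary small while the postings file holds the bulk of the data, which mirrors the dictionary/postings split asked for in step (c).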

2. Boolean and Vector queries.

a. Design three Boolean queries (for example, web AND search) and list the relevant documents for each query.

b. Use the Vector model to query the inverted index, and compare the results with those of the Boolean model. (Hint: you can use cosine similarity and set a similarity threshold.)
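One possible way to run both kinds of query over the same index is sketched below. The toy index, its stemmed terms, and the helper names are all hypothetical. The vector side assumes tf-idf weights with a natural-log idf and a threshold of 0.2, which is only one of several reasonable weighting schemes.

```python
import math

# Hypothetical toy index: stemmed term -> {doc_id: within-document frequency}.
index = {
    "googl":   {1: 1},
    "web":     {1: 2, 2: 1},
    "search":  {1: 2, 3: 1},
    "engin":   {1: 2, 3: 1},
    "databas": {2: 2},
    "privaci": {3: 1},
}
ALL_DOCS = {1, 2, 3}

def docs_with(term):
    """Set of documents whose postings contain the term."""
    return set(index.get(term, {}))

# Boolean model: each operator maps onto a set operation.
and_result = docs_with("web") & docs_with("search")     # web AND search
or_result = docs_with("web") | docs_with("databas")     # web OR databas
not_result = docs_with("search") - docs_with("googl")   # search AND NOT googl

def idf(term):
    """Inverse document frequency: log(N / df)."""
    return math.log(len(ALL_DOCS) / len(index[term]))

def doc_vector(doc_id):
    """tf-idf weights for every term that occurs in the document."""
    return {t: p[doc_id] * idf(t) for t, p in index.items() if doc_id in p}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def vector_query(terms, threshold=0.2):
    """Rank documents by cosine similarity and keep those at or above the threshold."""
    query = {t: idf(t) for t in terms if t in index}
    scores = {d: cosine(query, doc_vector(d)) for d in ALL_DOCS}
    kept = [(d, s) for d, s in scores.items() if s >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])

print("Boolean web AND search:", and_result)
print("Vector  web, search   :", vector_query(["web", "search"]))
```

On this toy data the Boolean query returns only the document containing both terms, while the vector query also ranks a document that matches just one of them, which is the kind of difference part (b) asks you to discuss.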

Question 2 (IR Evaluation): For this exercise, you are required to evaluate the performance of different search engines.

First, please find two search engines you are familiar with, such as Google, Bing, Yahoo!, etc.

Second, choose a target from the list below, and design two queries to run on both search engines.

The target is determined by the last digit of your student ID. For example, if your student ID ends in 1, choose target 1; if it ends in 0, choose target 10.

  • Target 1: obtain the unit guide of SIT771.
  • Target 2: obtain the unit guide of SIT772.
  • Target 3: obtain the unit guide of SIT773.
  • Target 4: obtain the unit guide of SIT774.
  • Target 5: obtain the price of the new Macbook.
  • Target 6: obtain the price of the new iPhone.
  • Target 7: obtain the price of a Lenovo Laptop.
  • Target 8: obtain the install document of MongoDB.
  • Target 9: obtain the manual of MongoDB.
  • Target 10: obtain the operation guide of MongoDB.

Take the first 20 results from each search engine. Mark each result that returns the target as relevant and all others as irrelevant. The following exercises are based on your search results.
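Once each result is marked relevant or irrelevant, precision and recall at every rank follow directly from the judgments. The sketch below assumes that the total number of relevant documents is simply the number found among the judged results, since the true total on the Web is unknown. The function name and the five-item judgment list are hypothetical.

```python
def precision_recall_at_ranks(relevance, total_relevant=None):
    """Return (rank, precision, recall) for each position in a ranked result list.

    relevance: list of booleans, True where the result at that rank is relevant.
    total_relevant: size of the relevant set; defaults to the number of relevant
    results in the list itself (an assumption for Web search, where the true
    total is unknown).
    """
    if total_relevant is None:
        total_relevant = sum(relevance)
    points = []
    hits = 0
    for rank, is_relevant in enumerate(relevance, start=1):
        hits += bool(is_relevant)
        precision = hits / rank
        recall = hits / total_relevant if total_relevant else 0.0
        points.append((rank, precision, recall))
    return points

# Hypothetical judgments for the first five results of one query.
judgments = [True, False, True, False, False]
points = precision_recall_at_ranks(judgments)
for rank, precision, recall in points:
    print(f"rank {rank}: precision={precision:.2f} recall={recall:.2f}")
```

For the assignment itself the list would hold 20 judgments per query and per search engine.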

a. List your target and designed search queries (you can use any keywords you think are related to the target).

For Search Engine 1, plot the precision versus recall curves for Query 1 and Query 2, interpolated to the 11 standard recall levels.

Also plot the average precision versus recall curve for Search Engine 1 (all three curves should be on a single chart).

b. For Search Engine 2, plot the precision versus recall curves for Query 1 and Query 2, interpolated to the 11 standard recall levels.

Also plot the average precision versus recall curve for Search Engine 2 (all three curves should be on a single chart, but a separate chart from that used in part (a)).

c. Plot the average curves for Search Engine 1 and Search Engine 2 together on a separate chart, and compare the two search engines in terms of precision and recall. Which search engine do you think is superior? Why?
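The curves in parts (a) to (c) rely on precision interpolated to the 11 standard recall levels (0.0, 0.1, ..., 1.0), where the interpolated precision at a level is the highest precision observed at any recall greater than or equal to that level. A minimal sketch follows, using hypothetical relevance judgments and made-up function names, and again assuming that the relevant set is limited to the relevant results actually retrieved.

```python
def interpolated_11pt(relevance):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0.

    relevance: list of booleans for one ranked result list.
    Assumes the relevant set is the relevant results present in the list.
    """
    total = sum(relevance)
    if total == 0:
        return [0.0] * 11
    observed = []  # (recall, precision) measured at each relevant result
    hits = 0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            observed.append((hits / total, hits / rank))
    levels = [i / 10 for i in range(11)]
    # Small tolerance guards against floating-point noise in the comparison.
    return [max((p for r, p in observed if r >= level - 1e-12), default=0.0)
            for level in levels]

def average_curves(curves):
    """Average several 11-point curves level by level."""
    return [sum(values) / len(values) for values in zip(*curves)]

# Hypothetical judgments for two queries on one search engine.
query1 = [True, False, True, False, False]
query2 = [False, True, False, False, True]
curve1 = interpolated_11pt(query1)
curve2 = interpolated_11pt(query2)
average = average_curves([curve1, curve2])
for level, value in zip(range(11), average):
    print(f"recall {level / 10:.1f}: average interpolated precision {value:.3f}")
```

The three lists `curve1`, `curve2`, and `average` are what would be drawn on a single chart for one search engine; any plotting tool can be used for the drawing step.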

DBMS, Programming

  • Category:- DBMS
  • Reference No.:- M92678706
