Consider an inverted index containing, for each term, the posting list (i.e. the list of documents and occurrences within documents) for that term. The posting lists are accessed through a B+ tree with the terms serving as search keys. Each leaf of the B+ tree holds a sublist of alphabetically consecutive terms, and, with each term, a pointer to the posting list for that term. 
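The structure described above can be sketched roughly as follows. This is a minimal illustration, not the tree from the question: the class names `Leaf` and `Internal`, the function `find_postings`, the toy terms, and the convention that a search key equal to a separator goes to the right child are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Leaf:
    terms: list      # sorted, alphabetically consecutive terms
    postings: list   # postings[i] stands in for the pointer to terms[i]'s posting list


@dataclass
class Internal:
    keys: list       # sorted separator terms
    children: list   # len(children) == len(keys) + 1


def find_postings(root, term):
    """Walk from the root to a leaf; return (visited nodes, posting list or None)."""
    node = root
    visited = [node]
    while isinstance(node, Internal):
        # Assumed convention: terms >= a separator key live in the child to its right.
        node = node.children[bisect_right(node.keys, term)]
        visited.append(node)
    i = bisect_left(node.terms, term)
    if i < len(node.terms) and node.terms[i] == term:
        return visited, node.postings[i]
    return visited, None


# Toy two-level tree (hypothetical data): each posting list is a list of
# (document id, [positions within the document]) pairs.
leaves = [
    Leaf(["ant", "bee"], [[(1, [4])], [(2, [9])]]),
    Leaf(["cat", "dog"], [[(1, [2])], [(5, [1, 8])]]),
    Leaf(["dune", "eel"], [[(3, [7])], [(4, [6])]]),
]
root = Internal(keys=["cat", "dune"], children=leaves)

visited, postings = find_postings(root, "dune")
print(len(visited), postings)
```

In this toy tree the lookup touches the root and one leaf. A tree with more levels would add one internal node per level to the visited list.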

Part a. An artificially small example of a B+ tree is shown in the accompanying PDF. (Note that only part of the tree is shown in detail.) Which nodes of the example B+ tree are visited to find the posting list for "dune"?

Part b. Suppose there are 2 million terms for a collection of 32 million documents of total size 200 gigabytes. We would like each internal node of the B+ tree and each leaf of the B+ tree to fit in one 8-kilobyte page of the file system. Recall that a B+ tree has a parameter m called the order of the tree, and each internal node of a B+ tree has between m+1 and 2m+1 children (except the root, which has between 2 and 2m+1). Assume that each term is represented using 16 bytes, and each pointer to a child in the tree or to a posting list is represented using 8 bytes. Find a value for the order m of the B+ tree so that one 8-kilobyte page can be assigned to each internal node and leaf, and so that an internal node will fill, but not overflow, its page when it has 2m+1 children. If you need to make additional assumptions, state what assumptions you are making.
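One way to approach the sizing in Part b is sketched below. It assumes that 8 kilobytes means 8192 bytes, that an internal node with 2m+1 children stores exactly 2m separator terms and 2m+1 child pointers, and that nodes carry no header or bookkeeping bytes. The names `internal_node_bytes` and `largest_order` are hypothetical helpers for this sketch.

```python
PAGE_BYTES = 8 * 1024   # assumption: 1 kilobyte = 1024 bytes
TERM_BYTES = 16         # given in the question
POINTER_BYTES = 8       # given in the question


def internal_node_bytes(m):
    """Bytes used by a full internal node: 2m keys plus 2m+1 child pointers.

    Assumption: no per-node header or other overhead.
    """
    return 2 * m * TERM_BYTES + (2 * m + 1) * POINTER_BYTES


def largest_order(page_bytes=PAGE_BYTES):
    """Largest m for which a full internal node still fits in one page."""
    m = 1
    while internal_node_bytes(m + 1) <= page_bytes:
        m += 1
    return m


m = largest_order()
print(m, internal_node_bytes(m), internal_node_bytes(m + 1))

# A leaf entry is one term plus one posting-list pointer (24 bytes),
# so a page has room for this many leaf entries:
leaf_capacity = PAGE_BYTES // (TERM_BYTES + POINTER_BYTES)
print(leaf_capacity)
```

Under these assumptions a full node uses 48m + 8 bytes, so the largest order that fits is m = 170 (8168 of 8192 bytes used), and a page has room for 341 leaf entries. Different assumptions about node headers or sibling pointers in leaves would shift these numbers slightly.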

Part c. For your m of Part b, estimate the height of the B+ tree. (Giving a range of heights is fine.) Also estimate the amount of memory needed to store the tree, including leaves but not including the posting lists themselves.
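The estimate in Part c can be bounded by counting pages level by level for the fullest and the emptiest legal tree. The sketch below assumes m = 170 (the largest order for which 2m 16-byte keys plus 2m+1 8-byte pointers fit in 8192 bytes), and further assumes that a leaf holds between m and 2m terms. The function name `levels_and_pages` is hypothetical.

```python
import math

M = 170                  # assumption: order derived from 48m + 8 <= 8192
NUM_TERMS = 2_000_000    # given in the question
PAGE_BYTES = 8 * 1024    # assumption: 1 kilobyte = 1024 bytes


def levels_and_pages(entries_per_leaf, fanout):
    """Count node levels and total pages, building from the leaves up to one root."""
    nodes = math.ceil(NUM_TERMS / entries_per_leaf)   # number of leaves
    pages = nodes
    levels = 1
    while nodes > 1:
        nodes = math.ceil(nodes / fanout)             # parents needed for this level
        pages += nodes
        levels += 1
    return levels, pages


# Fullest tree: 2m terms per leaf, 2m+1 children per internal node.
full = levels_and_pages(2 * M, 2 * M + 1)
# Emptiest tree: m terms per leaf, m+1 children per internal node.
half = levels_and_pages(M, M + 1)

for label, (levels, pages) in (("full", full), ("half-full", half)):
    print(label, levels, pages, pages * PAGE_BYTES / 2**20, "MiB")
```

In both extremes the tree has three levels of nodes (root, one internal level, leaves), and the root ends up with 18 or 69 children, which is within the permitted range of 2 to 2m+1. The page counts come to 5902 and 11835, or roughly 46 to 92 MiB, almost all of it in the leaves.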

Part d. Estimate the aggregate size of the posting lists.
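Part d depends almost entirely on assumptions the question leaves open, so the sketch below only shows how such an estimate might be parameterized. Every default value is an assumption chosen for illustration: about 6 bytes of text per word occurrence, an average of 2 occurrences of a term in each document that contains it, 4-byte document ids, 4-byte positions, and no compression. The function name `estimate_posting_bytes` is hypothetical.

```python
COLLECTION_BYTES = 200 * 10**9   # given: 200 gigabytes (taken here as 10**9 bytes each)
NUM_DOCS = 32_000_000            # given in the question


def estimate_posting_bytes(bytes_per_token=6,
                           occurrences_per_doc_entry=2,
                           doc_id_bytes=4,
                           position_bytes=4):
    """Rough uncompressed size of all posting lists; all defaults are assumptions."""
    # Every word occurrence in the collection produces one stored position.
    occurrences = COLLECTION_BYTES // bytes_per_token
    # Each (term, document) pair produces one stored document id.
    doc_entries = occurrences // occurrences_per_doc_entry
    return occurrences * position_bytes + doc_entries * doc_id_bytes


# 32 million document ids fit in 25 bits, so 4 bytes per id is sufficient.
assert NUM_DOCS < 2**25

total = estimate_posting_bytes()
print(total / 10**9, "GB")
```

With these particular defaults the posting lists come out at about the same size as the collection itself, around 200 GB. Changing any of the assumed parameters, or adding compression, moves the estimate substantially, which is why stating the assumptions matters more than the final number.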

