Use Crawler Java Assignment

Review, fix and run the crawler.

Add code for the additional requirements.

Make sure your crawler does the following.

Test your crawler only on the data in:

http://lyle.smu.edu/~fmoore

Make sure that your crawler is not allowed to get out of this directory!!! Yes, there is a robots.txt file that must be used. Note that it is in a non-standard location.
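The directory restriction and the robots.txt rule above can be sketched as a single "may I fetch this URL?" check. This is only a minimal sketch under stated assumptions: the class name CrawlScope and its methods are hypothetical, the robots.txt body is assumed to have been downloaded already from wherever it lives, only the "User-agent: *" group is read, and Disallow values are treated as plain path prefixes (no wildcards).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: decides whether a URL is inside the crawl root and not disallowed. */
final class CrawlScope {
    private final String host;
    private final String rootPath;                        // e.g. "/~fmoore"
    private final List<String> disallowed = new ArrayList<>();

    CrawlScope(String rootUrl) {
        URI root = URI.create(rootUrl);
        this.host = root.getHost();
        String p = root.getPath();
        this.rootPath = p.endsWith("/") ? p.substring(0, p.length() - 1) : p;
    }

    /** Collects Disallow prefixes from the "User-agent: *" group (simplified parsing). */
    void loadRobots(String robotsTxt) {
        boolean applies = false;
        for (String raw : robotsTxt.split("\\R")) {
            String line = raw.replaceAll("#.*", "").trim();   // drop comments
            int colon = line.indexOf(':');
            if (colon < 0) continue;
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                applies = value.equals("*");
            } else if (field.equals("disallow") && applies && !value.isEmpty()) {
                disallowed.add(value);
            }
        }
    }

    /** True only for URLs on the root host, under the root path, and not disallowed. */
    boolean allows(String url) {
        URI u;
        try {
            u = URI.create(url).normalize();              // resolves "." and ".." segments
        } catch (IllegalArgumentException e) {
            return false;                                 // malformed link: treat as out of scope
        }
        if (u.getHost() == null || !u.getHost().equalsIgnoreCase(host)) return false;
        String path = u.getPath() == null ? "" : u.getPath();
        boolean inside = path.equals(rootPath) || path.startsWith(rootPath + "/");
        if (!inside) return false;
        for (String rule : disallowed) {
            if (path.startsWith(rule)) return false;
        }
        return true;
    }
}
```

Normalizing before the prefix test matters: a link such as http://lyle.smu.edu/~fmoore/../other/ would otherwise pass a naive string comparison while actually pointing outside the directory.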

The required inputs to your program are N, the limit on the number of pages to retrieve, and a list of stop words (of your choosing) to exclude.

Perform case-insensitive matching.
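One way to combine the stop-word list with case-insensitive matching is to lower-case both the stop words and the page text before comparing. The sketch below is an illustration only: the class name WordFilter is hypothetical, and it assumes a "word" is a maximal run of ASCII letters and digits, which is just one possible definition.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: tokenizes text and drops stop words, ignoring case. */
final class WordFilter {
    private final Set<String> stopWords = new HashSet<>();

    WordFilter(List<String> stopWordList) {
        // Store stop words in lower case so lookups are case-insensitive.
        for (String w : stopWordList) {
            stopWords.add(w.toLowerCase(Locale.ROOT));
        }
    }

    /** Returns the lower-cased words of the text that are not stop words, in order. */
    List<String> keep(String text) {
        List<String> out = new ArrayList<>();
        // Assumption: anything other than a-z or 0-9 separates words.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }
}
```

For example, with the stop words "the" and "and", the text "The Crawler AND the spider" would be reduced to the two words crawler and spider.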

You can assume that there are no errors in the input. Your code should be robust under errors in the Web pages you're searching. If an error is encountered, feel free, if necessary, just to skip the page where it is encountered.
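The page limit N and the "skip the page on error" rule can both live in the main crawl loop. The following is a rough sketch, not the attached project's code: BoundedCrawl and LinkFetcher are hypothetical names, the actual downloading and link extraction are hidden behind the LinkFetcher interface, and it assumes that only successfully retrieved pages count toward N.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical breadth-first crawl loop with a page limit and error skipping. */
final class BoundedCrawl {
    /** Hypothetical abstraction: returns the out-going links of a page, or throws on failure. */
    interface LinkFetcher {
        List<String> fetchLinks(String url) throws IOException;
    }

    /** Returns the URLs retrieved successfully, in crawl order, at most maxPages of them. */
    static List<String> crawl(String seed, int maxPages, LinkFetcher fetcher) {
        Deque<String> frontier = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        List<String> retrieved = new ArrayList<>();
        frontier.add(seed);
        seen.add(seed);
        while (!frontier.isEmpty() && retrieved.size() < maxPages) {
            String url = frontier.poll();
            List<String> links;
            try {
                links = fetcher.fetchLinks(url);
            } catch (IOException | RuntimeException e) {
                continue;                       // bad page: skip it and keep crawling
            }
            retrieved.add(url);
            for (String link : links) {
                if (seen.add(link)) {           // add() is false for URLs already queued
                    frontier.add(link);
                }
            }
        }
        return retrieved;
    }
}
```

Catching RuntimeException alongside IOException is a deliberate choice here, so that a parser failure on malformed HTML skips one page rather than ending the whole crawl.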

1. Identify the key properties of a web crawler. Describe in detail how each of these properties is implemented in your code.

2. Use your crawler to list the URL of all pages in the test data and report all out-going links of the test data. [10 points] display the contents of the tag

3. Implement duplicate detection, and report if any URLs refer to already seen content.

4. Use your crawler to list all broken links within the test data.

5. How many graphic files are included in the test data?

6. Have your crawler save the words from each page of type (.txt, .htm, .html). Make sure that you do not save HTML markup. Explain your definition of "word". In this process, give each page a unique document ID.

Implement stemming.

7. Report the 20 most common words, each with its document frequency. Are these words or stemmed words?

Attachment: crawler_project.zip (http://sharing.mywordsolution.com/XtringFiles/409_crawler_project.zip)
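Duplicate detection (item 3), document IDs (item 6) and document frequency (item 7) can share one small index. The sketch below is one possible approach under stated assumptions, not the attached project's code: PageIndex is a hypothetical name, duplicates are detected only when the extracted text is byte-for-byte identical (via a SHA-256 digest), the text is assumed to be already stripped of HTML markup, a "word" is a run of ASCII letters and digits, and neither stemming nor stop-word removal is applied.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical index: assigns document IDs, flags exact duplicates, counts document frequency. */
final class PageIndex {
    private final Map<String, Integer> docIdByDigest = new HashMap<>();
    private final Map<String, Integer> docFrequency = new HashMap<>();
    private int nextDocId = 1;

    /** Indexes a page's text; returns its new document ID, or -1 if the same text was seen before. */
    int add(String text) {
        String digest = sha256(text);
        if (docIdByDigest.containsKey(digest)) {
            return -1;                                   // exact duplicate content
        }
        int id = nextDocId++;
        docIdByDigest.put(digest, id);
        // Document frequency counts each word once per document, so collect distinct words first.
        Set<String> distinct = new HashSet<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty()) distinct.add(w);
        }
        for (String w : distinct) {
            docFrequency.merge(w, 1, Integer::sum);
        }
        return id;
    }

    /** Returns up to k words ordered by descending document frequency, ties broken alphabetically. */
    List<Map.Entry<String, Integer>> top(int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(docFrequency.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(k, entries.size())));
    }

    private static String sha256(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);          // SHA-256 is a required algorithm on Java platforms
        }
    }
}
```

Calling top(20) after indexing every page would give the list asked for in item 7. Note that document frequency differs from total term count: a word repeated ten times on one page still contributes 1.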