Title: A Full-text Website Search Engine Powered by Lucene and The Depth First Search Algorithm
Authors: Mabayoje, Modina A.
Oni, O. S.
Adebayo, Olawale Surajudeen
Keywords: Full Text search engine
Relational Database
Information Retrieval
Lucene
Depth first search algorithm
Issue Date: 2013
Publisher: International Journal of Computer Network and Information Security (IJCNIS)
Series/Report no.: Volume 5;3
Abstract: With the amount of text data available on the web growing rapidly, the need for users to search that information is increasing dramatically. Full-text search engines and relational databases each have unique strengths as development tools, but they also have overlapping capabilities: both can store and update data, and both support searching it. Full-text systems are better at quickly searching high volumes of unstructured text for the presence of any word or combination of words. They provide rich text search capabilities and sophisticated relevancy ranking tools for ordering results by how well they match a potentially fuzzy search request. Relational databases, on the other hand, excel at storing and manipulating structured data: records made up of fields of specific types (text, integer, currency, etc.). They can do so with little or no redundancy. They support flexible search of multiple record types for specific field values, as well as strong tools for quickly and securely updating individual records. Because the web is an ever-growing collection of largely unstructured documents, using an RDBMS to search it has become very costly. This paper describes the architecture, design and implementation of a prototype website search engine powered by Lucene that can search through any website. The approach involves developing a small-scale web crawler to gather information from the desired website. The gathered information is then converted into Lucene documents and stored in the index. The time taken to search the index is very short compared with the time a relational database takes to process a query.
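The pipeline the abstract outlines (crawl a site depth first, turn each page into a document, add it to an index, then query the index) can be illustrated with a minimal sketch. This is not the paper's implementation and does not use Lucene: the in-memory `SITE` map, the `crawl_dfs`, `build_index` and `search` helpers, and the simple inverted index standing in for a Lucene index are all assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "website": URL -> (page text, outgoing links).
# A real crawler would fetch pages over HTTP and parse links from HTML.
SITE = {
    "/": ("Welcome to the home page", ["/about", "/news"]),
    "/about": ("About the department of computer science", ["/", "/staff"]),
    "/staff": ("Staff list and contact information", []),
    "/news": ("Latest news about the search engine project", ["/about"]),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crawl_dfs(start, fetch):
    """Visit pages depth first from `start`; return (url, text) in visit order."""
    visited, pages = set(), []
    stack = [start]
    while stack:
        url = stack.pop()
        if url in visited:
            continue
        visited.add(url)
        text, links = fetch(url)
        pages.append((url, text))
        # Push links in reverse so the first link on a page is followed first.
        for link in reversed(links):
            if link not in visited:
                stack.append(link)
    return pages


def build_index(pages):
    """Build a toy inverted index (term -> set of URLs); a stand-in for Lucene."""
    index = defaultdict(set)
    for url, text in pages:
        for term in tokenize(text):
            index[term].add(url)
    return index


def search(index, query):
    """Return the URLs whose text contains every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


pages = crawl_dfs("/", SITE.__getitem__)
index = build_index(pages)
print([url for url, _ in pages])
print(sorted(search(index, "search engine")))
```

A lookup in this sketch touches only the posting sets for the query terms rather than scanning every page, which is the general property of inverted indexes that the abstract's speed comparison relies on. Ranking, stemming and fuzzy matching, which Lucene provides, are deliberately left out.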
URI: http://repository.futminna.edu.ng:8080/jspui/handle/123456789/1963
ISSN: 2074 – 9090
2074 – 9104
Appears in Collections: Cyber Security Science

Files in This Item:
File: Lucene IJCNISpaper(mabayoje).pdf
Size: 879.64 kB
Format: Adobe PDF