You may work with one partner on this assignment. Anyone from the class is fine.
I strongly encourage you to work in pairs on the project assignments.
#####################################
Enter next Query String or -1 to quit
#####################################
artificial
Results for your Query "artificial":
-----------------------------------------------
Query: artificial
Query Result NOT Found in cache <-------- print out whether or not query is in cache
0 words from query found in cache <-------- if not, print out how many words of the query are in the cache
Matching URLs:
cs.brynmawr.edu/~eeaton priority: -8
#####################################
Enter next Query String or -1 to quit
#####################################
artificial intelligence
Results for your Query "artificial intelligence":
-----------------------------------------------
Query: artificial intelligence
Query Result NOT Found in cache
1 words from query found in cache
Matching URLs:
cs.brynmawr.edu/~eeaton priority: -17
#####################################
Enter next Query String or -1 to quit
#####################################
intelligence
Results for your Query "intelligence":
-----------------------------------------------
Query: intelligence
Query Result Found in cache
Matching URLs:
cs.brynmawr.edu/~eeaton priority: -9
http://download.oracle.com/javase/tutorial/uiswing/components/editorpane.html
Getting your GUI to display URL files and to display search result strings is easy if you use the JEditorPane's setPage and setText methods. Remember to import javax.swing.text.* and java.net.URL along with the standard Swing imports.
You will probably want to use the GridBagLayout to lay out the buttons and the text display in the same panel. This will likely give the best results when the window is resized (you want the display parts of the window to resize vertically, but probably not the buttons or the search text and URL text boxes).
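As a rough illustration of the layout idea above, here is a minimal sketch of one panel that holds two rows of controls and a JEditorPane. Only the display row is given vertical weight, so it is the only part that grows when the window gets taller. The class name BrowserPanelSketch, the field and button names, and the placeholder search action are all assumptions made for this example; they are not part of the assignment, and your real class must still be called WebBrowser.

```java
import java.awt.GridBagConstraints;
import java.awt.GridBagLayout;
import java.awt.Insets;
import java.io.IOException;
import javax.swing.JButton;
import javax.swing.JEditorPane;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextField;
import javax.swing.SwingUtilities;

// Hypothetical sketch class; all names here are illustrative only.
class BrowserPanelSketch {
    final JTextField urlField = new JTextField(30);
    final JTextField searchField = new JTextField(30);
    final JButton fetchButton = new JButton("Fetch URL");
    final JButton searchButton = new JButton("Search");
    final JEditorPane display = new JEditorPane();
    final JPanel panel = new JPanel(new GridBagLayout());

    BrowserPanelSketch() {
        display.setEditable(false);
        addRow(0, new JLabel("URL:"), urlField, fetchButton);
        addRow(1, new JLabel("Search:"), searchField, searchButton);

        // The display spans all three columns and absorbs all extra space.
        GridBagConstraints c = new GridBagConstraints();
        c.gridx = 0;
        c.gridy = 2;
        c.gridwidth = 3;
        c.weightx = 1.0;
        c.weighty = 1.0;
        c.fill = GridBagConstraints.BOTH;
        panel.add(new JScrollPane(display), c);

        fetchButton.addActionListener(e -> showPage(urlField.getText()));
        // Placeholder: a real browser would run the query and display its result string.
        searchButton.addActionListener(e -> showText("Query: " + searchField.getText()));
    }

    // Adds one label/field/button row; weighty stays 0 so the row keeps its height.
    private void addRow(int row, JLabel label, JTextField field, JButton button) {
        GridBagConstraints c = new GridBagConstraints();
        c.gridy = row;
        c.insets = new Insets(2, 2, 2, 2);
        c.gridx = 0;
        panel.add(label, c);
        c.gridx = 1;
        c.weightx = 1.0;
        c.fill = GridBagConstraints.HORIZONTAL;
        panel.add(field, c);
        c.gridx = 2;
        c.weightx = 0.0;
        c.fill = GridBagConstraints.NONE;
        panel.add(button, c);
    }

    // Loads a URL into the display with setPage.
    void showPage(String url) {
        try {
            display.setPage(url);
        } catch (IOException ex) {
            showText("Could not load " + url + ": " + ex.getMessage());
        }
    }

    // Shows a plain-text string (for example a query result) with setText.
    void showText(String text) {
        display.setContentType("text/plain");
        display.setText(text);
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("WebBrowser layout sketch");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(new BrowserPanelSketch().panel);
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```

The text fields stretch horizontally (weightx of 1.0 with a HORIZONTAL fill) but not vertically, which is one way to get the resizing behavior described above.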
To get the query result to print out nicely in the display, you
need to convert the query results to a string (either in HTML or
ASCII). One way to do this is to add a queryToString method (or
a queryToHTMLString method) to the ProcessQuery class that takes
as input the result returned by the performQuery method and
converts it to a string, using "\n" for line breaks in plain text
or <br> for line breaks in HTML. You can then call setText on the
JEditorPane to display the string.
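Here is a minimal sketch of what such conversion methods might look like. It assumes, purely for illustration, that performQuery returns a list of (URL, priority) pairs that is already in display order; your own performQuery may return something different, in which case only the parameter type changes. The class name QueryFormatSketch and the escape helper are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; in the assignment these methods would live in your query class.
class QueryFormatSketch {
    // Plain-text version: one line per URL, separated by "\n".
    static String queryToString(String query, List<Map.Entry<String, Integer>> results) {
        StringBuilder sb = new StringBuilder();
        sb.append("Query: ").append(query).append("\n");
        sb.append("Matching URLs:\n");
        for (Map.Entry<String, Integer> r : results) {
            sb.append(r.getKey()).append(" priority: ").append(r.getValue()).append("\n");
        }
        return sb.toString();
    }

    // HTML version: same content, with <br> as the line separator.
    static String queryToHTMLString(String query, List<Map.Entry<String, Integer>> results) {
        StringBuilder sb = new StringBuilder("<html><body>");
        sb.append("Query: ").append(escape(query)).append("<br>");
        sb.append("Matching URLs:<br>");
        for (Map.Entry<String, Integer> r : results) {
            sb.append(escape(r.getKey())).append(" priority: ").append(r.getValue()).append("<br>");
        }
        sb.append("</body></html>");
        return sb.toString();
    }

    // Keeps characters such as < in the query from being read as HTML tags.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> results =
                List.of(Map.entry("cs.brynmawr.edu/~eeaton", -8));
        System.out.print(queryToString("artificial", results));
        System.out.println(queryToHTMLString("artificial", results));
    }
}
```

If you use the HTML version, the JEditorPane's content type needs to be "text/html" before you call setText; for the plain-text version it should be "text/plain".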
Note: the links on the displayed webpage don't work. You do not
have to add support to make links work. However, there is a way
to add this functionality, and you can try it out if you'd like.
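If you do want to try it, one possible approach (a sketch, not a required part of the assignment) is to register a HyperlinkListener on the JEditorPane. The pane only fires hyperlink events when it is not editable. The class and method names LinkSupportSketch and enableLinks are made up for this example.

```java
import java.io.IOException;
import javax.swing.JEditorPane;
import javax.swing.event.HyperlinkEvent;

// Hypothetical helper; names are illustrative only.
class LinkSupportSketch {
    static void enableLinks(JEditorPane pane) {
        // Hyperlink events are only generated for a non-editable pane.
        pane.setEditable(false);
        pane.addHyperlinkListener(e -> {
            // ACTIVATED means the user clicked the link (as opposed to hovering).
            if (e.getEventType() == HyperlinkEvent.EventType.ACTIVATED && e.getURL() != null) {
                try {
                    pane.setPage(e.getURL());
                } catch (IOException ex) {
                    pane.setText("Could not load " + e.getURL() + ": " + ex.getMessage());
                }
            }
        });
    }

    public static void main(String[] args) {
        JEditorPane pane = new JEditorPane();
        enableLinks(pane);
        System.out.println("listeners: " + pane.getHyperlinkListeners().length);
    }
}
```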
Your GUI class should be named WebBrowser and be defined in WebBrowser.java.
This is also the main class that you will run to launch the
WebBrowser.
You might also find the Java Trail on Swing helpful:
http://download.oracle.com/javase/tutorial/ui/index.html
The cache is represented as a hashtable where the key is the query string. The element stored with the query should be an instance of a new class called SavedResult. A SavedResult contains another hash table storing URLs with their count and a String answer.
hashtable cache
-------------------------------------------------------------------------
|key:           |       |       |       |       |       |       |       |
|  String       |       |       |       |       |       |       |       |
|element:       |       |       |       |       |       |       |       |
|  SavedResult  |       |       |       |       |       |       |       |
|               |       |       |       |       |       |       |       |
-------------------|-----------------------------------------------------
                   |
                   \/
              SavedResult
               ---------     ----------------------------------------
               | table |---->| URL, count |      |      |      |    |
               ---------     ----------------------------------------
               | answer|----> String containing query result, or null if
               ---------      this entry was created to satisfy a larger
               | best  |      query of which this is a part
               | match |
               | URL   |----> The best matching URL as a string (this is for
               ---------      automatically displaying the best matching URL
                              in your Web browser GUI)

1. For each word in query, create an alphabetically ordered query string
   that is all one case and the same case as you stored tokens into your
   WordFrequencyTrees
2. Search for the query in the cache
   (a) if a match is found
       check answer field of entry
       if null, then create answer based on table
          and add best matching URL string to this entry as well
       return result
   (b) if a match is not found
       (A) create a hashtable for the query string
           for each word in the query
           (1) if it is in the cache then get its hashtable and merge
               its contents with the hashtable for the query
           (2) otherwise
               (a) create an entry for the word by processing the
                   WordFrequencyTrees for the word and creating a
                   hashtable of (URL, count) entries hashed on URL
                   (the result field is null)
               (b) merge this hashtable with the hashtable for the query
       (B) create a result priority queue from the entries in the query's
           hashtable (priority is related to count field of URL), then
           use this priority queue to create a String representation of
           the query answer
       (C) add the (query, (hashtable, answer)) to the cache
       (D) return result to caller
3. After each query print out:
   (a) whether or not the query was found in the cache
   (b) if not, the number of query words that were found in the cache

The best place to implement caching is in your ProcessQueries class. However, you may also implement another class to perform the caching and make calls to the ProcessQueries class methods.
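To make the caching steps concrete, here is a minimal sketch under several stated assumptions. It uses java.util.HashMap and java.util.PriorityQueue as stand-ins for your own hashtable and priority queue classes, replaces the WordFrequencyTrees lookup with a function passed to the constructor, drops repeated words when normalizing, and prints each priority as the negated count only because the sample output above shows negative priorities. The name QueryCacheSketch, the helper names, and the counts in main are all invented for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical sketch of the caching steps; not the required implementation.
class QueryCacheSketch {
    private final Map<String, SavedResult> cache = new HashMap<>();
    // Stand-in for searching the WordFrequencyTrees: word -> (URL -> count).
    private final Function<String, Map<String, Integer>> treeLookup;

    QueryCacheSketch(Function<String, Map<String, Integer>> treeLookup) {
        this.treeLookup = treeLookup;
    }

    // Step 1: one case, alphabetical order (assumption: repeated words are dropped).
    static String normalize(String query) {
        String[] words = query.trim().toLowerCase().split("\\s+");
        return String.join(" ", new TreeSet<>(Arrays.asList(words)));
    }

    String performQuery(String rawQuery) {
        String query = normalize(rawQuery);
        SavedResult entry = cache.get(query);
        if (entry != null) {                        // step 2(a)
            System.out.println("Query Result Found in cache");
            if (entry.answer == null) {
                buildAnswer(query, entry);
            }
            return entry.answer;
        }
        System.out.println("Query Result NOT Found in cache");
        entry = new SavedResult();                  // step 2(b)(A)
        int wordsInCache = 0;
        for (String word : query.split(" ")) {
            SavedResult wordEntry = cache.get(word);
            if (wordEntry != null) {
                wordsInCache++;
            } else {
                wordEntry = new SavedResult();      // answer stays null for now
                wordEntry.table.putAll(treeLookup.apply(word));
                cache.put(word, wordEntry);
            }
            for (Map.Entry<String, Integer> e : wordEntry.table.entrySet()) {
                entry.table.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        System.out.println(wordsInCache + " words from query found in cache");
        buildAnswer(query, entry);                  // step 2(b)(B)
        cache.put(query, entry);                    // step 2(b)(C)
        return entry.answer;                        // step 2(b)(D)
    }

    // Orders URLs by descending count and fills in answer and bestMatchUrl.
    private static void buildAnswer(String query, SavedResult entry) {
        PriorityQueue<Map.Entry<String, Integer>> pq =
                new PriorityQueue<>((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        pq.addAll(entry.table.entrySet());
        if (!pq.isEmpty()) {
            entry.bestMatchUrl = pq.peek().getKey();
        }
        StringBuilder sb = new StringBuilder("Query: " + query + "\nMatching URLs:\n");
        while (!pq.isEmpty()) {
            Map.Entry<String, Integer> e = pq.poll();
            // Assumption: priority is shown as the negated count, as in the sample output.
            sb.append(e.getKey()).append(" priority: ").append(-e.getValue()).append("\n");
        }
        entry.answer = sb.toString();
    }

    public static void main(String[] args) {
        // Made-up counts standing in for real WordFrequencyTrees data.
        Map<String, Map<String, Integer>> fakeTrees = Map.of(
                "artificial", Map.of("cs.brynmawr.edu/~eeaton", 8),
                "intelligence", Map.of("cs.brynmawr.edu/~eeaton", 9));
        QueryCacheSketch qc = new QueryCacheSketch(w -> fakeTrees.getOrDefault(w, Map.of()));
        for (String q : new String[] {"artificial", "artificial intelligence", "intelligence"}) {
            System.out.print(qc.performQuery(q));
        }
    }
}

// The element stored in the cache for each query string.
class SavedResult {
    final Map<String, Integer> table = new HashMap<>(); // URL -> count
    String answer;        // null if created only to satisfy a larger query
    String bestMatchUrl;  // null until the answer has been built
}
```

Notice that the third query in main is found in the cache even though it was never asked directly: its entry was created, with a null answer, while processing the two-word query before it.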
Think carefully about which implementation of the HashTable you plan to use for your cache: does it make a difference?
Classes you'll need for this assignment include the classes you developed for part 2 of the project, plus the following (available in your dropbox folder):
html_ignore: File containing tokens that should be ignored from an HTML input file.
urlListFile: File of URLs on which you tested your program, containing 100 - 200 URLs.