FindinSite-JS: Search engine for a Java server website

 

findinsite-js word highlighting


findinsite-js can highlight search words in result web pages. Highlighting is on by default, but can be turned off in the Look and Feel screen of the online configuration. See below for details of how to change the highlighting HTML. It is recommended that you only use highlighting if you are displaying hits on your own web site.

Word highlighting is very useful because it lets the user see their search words on the page straight away, making the search process much more friendly. findinsite-js does not use cached web pages when highlighting - it uses the live page.

Example:

For example, if you search for brown car, the hits are listed as normal. If you click on a hit that is an HTML web page, findinsite-js displays the page with all search words and their variants highlighted. The page is scrolled to show the first highlight.

Josie jumped out of the car and landed in the brown mud.
Brown cars came past and splashed her.
"Brown car, go away!" shouted Josie.

By default, a header is added to the page to tell the user that it has been highlighted by findinsite-js. This can be switched off in the configuration screen, if desired. Example header:

Page http://www.phdcc.com/fisjs/highlite.htm highlighted by findinsite-js

In the results list, clicking on the hit link will show the page with highlighting. If you right-click and choose any option (eg Open in New Window) then the hit page is shown without highlighting.

Cross-domain highlighting

The highlighting process will work across domain boundaries, so findinsite-js on our web site http://www.findinsite.info/ can highlight pages on your domain, eg www.example.org.

  • Warning:  While the highlighting method (see below) should resolve links correctly, it is possible that mischievous code could interfere at some point with the findinsite-js web site. It is therefore recommended that you only highlight on your own web site.

  • In a very small number of cases, highlighted pages are not shown correctly if the findinsite-js domain is different from the site domain. If there is a problem, either switch off highlighting, or run findinsite-js at the site domain.

    The one case that we have found is an unusual frameset, where a frameset is embedded in a larger page and created dynamically after the main page has been loaded. The problem is that the FRAME SRC is loaded by the browser from the findinsite-js site, not the searched site. The Cached page shown by a major search engine also suffers from this problem.


Word Highlighting Technical Details

Overview:  findinsite-js highlights words in a result web page by:
1.  Reading the result web page
2.  Adding in highlighting HTML
3.  Returning the amended web page to the user

When the user clicks on a (highlighting) link in the result list, findinsite-js is called again. The link URL includes parameters to tell findinsite-js which page to highlight and what words to highlight.
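As a rough illustration of such a link, the sketch below builds a URL that carries the page address and the search words as URL-encoded query parameters. The parameter names page and words and the search address are hypothetical placeholders chosen for this example; the actual names used by findinsite-js are not given here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class HighlightLinkSketch {
    /**
     * Builds a link back to the search application that names the page to
     * highlight and the words to highlight.
     * "page" and "words" are made-up parameter names, for illustration only.
     */
    static String buildLink(String searchUrl, String pageUrl, String words) {
        return searchUrl
                + "?page=" + URLEncoder.encode(pageUrl, StandardCharsets.UTF_8)
                + "&words=" + URLEncoder.encode(words, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical search address on an example domain.
        System.out.println(buildLink(
                "http://www.example.org/search",
                "http://www.example.org/a.htm",
                "brown car"));
    }
}
```

Encoding both values keeps characters such as ':', '/' and spaces from being misread as part of the link's own structure.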

findinsite-js retrieves the requested page. It then adds in the highlighting HTML and returns it to the browser. Normally this process would not work, because relative links in the page would be resolved against the findinsite-js address instead of the address of the original page. findinsite-js gets round this problem by adding a <base> tag at the top of the page. For example, for this page online, it would add in the following, together with the explanatory header:

<base href='http://www.phdcc.com/fisjs/highlite.htm'>
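The effect of the <base> tag can be sketched in a few lines of Java. This is a minimal illustration, not the findinsite-js source: the class and method names are invented for the example, and it assumes the page URL contains no single quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BaseTagSketch {
    // Matches an opening <head> tag, with or without attributes, in any case.
    private static final Pattern HEAD_OPEN =
            Pattern.compile("<head(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    /** Returns the HTML with a base tag that points at the original page URL. */
    static String addBaseTag(String html, String pageUrl) {
        // Assumption: pageUrl contains no single quote.
        String baseTag = "<base href='" + pageUrl + "'>";
        Matcher m = HEAD_OPEN.matcher(html);
        if (m.find()) {
            // Insert straight after <head> so the tag precedes every relative link.
            return html.substring(0, m.end()) + baseTag + html.substring(m.end());
        }
        // No <head> found: fall back to putting the tag at the very top.
        return baseTag + html;
    }

    public static void main(String[] args) {
        String page = "<html><head><title>T</title></head>"
                + "<body><a href='other.htm'>x</a></body></html>";
        System.out.println(addBaseTag(page, "http://www.phdcc.com/fisjs/highlite.htm"));
    }
}
```

With the tag in place, the browser resolves a relative link such as other.htm against the original page address rather than the address that served the highlighted copy.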

findinsite-js passes all HTTP request headers to the requested page, and returns all received HTTP response headers in its response.

If there are any problems in the above process, then findinsite-js aborts highlighting and redirects the browser to show the page without highlighting.

Note that the above process works with any sort of URL that produces HTML output, even pages generated dynamically by ASP, ASPX or PHP. Also note that findinsite-js always requests the live page - it does not use a cached copy of the page, which could be out of date.


Changing the Highlighting HTML

findinsite-js highlights found search words by inserting HTML before and after the found words. The default highlighting shows the words in bold red text on a yellow background. This is achieved using the following HTML:

highlightStart <span style='background:yellow;color:red;font-weight:bold;'>
highlightEnd </span>
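To show how a start and end fragment of this kind wrap a found word, here is a simplified Java sketch. It is an illustration under stated assumptions, not the findinsite-js algorithm: it matches exact whole words only (no word variants), ignores case, leaves the insides of tags untouched, and does not treat script, style or comment content specially. The class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HighlightSketch {
    // Default fragments as listed above.
    static final String HIGHLIGHT_START =
            "<span style='background:yellow;color:red;font-weight:bold;'>";
    static final String HIGHLIGHT_END = "</span>";

    // Group 1: a complete tag (or a stray '<'). Group 2: a run of text between tags.
    private static final Pattern TAG_OR_TEXT = Pattern.compile("(<[^>]*>|<)|([^<]+)");

    /** Wraps each whole-word, case-insensitive match of the given words. */
    static String highlight(String html, String... words) {
        if (words.length == 0) {
            return html;
        }
        // Build one alternation of the quoted search words.
        StringBuilder alt = new StringBuilder();
        for (String w : words) {
            if (alt.length() > 0) alt.append('|');
            alt.append(Pattern.quote(w));
        }
        Pattern wordPattern = Pattern.compile("\\b(" + alt + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

        StringBuilder out = new StringBuilder();
        Matcher m = TAG_OR_TEXT.matcher(html);
        while (m.find()) {
            if (m.group(1) != null) {
                out.append(m.group(1)); // tags are copied unchanged
            } else {
                // Only text between tags is searched for the words.
                out.append(wordPattern.matcher(m.group(2)).replaceAll(
                        Matcher.quoteReplacement(HIGHLIGHT_START) + "$1"
                        + Matcher.quoteReplacement(HIGHLIGHT_END)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("<p>\"Brown car, go away!\" shouted Josie.</p>",
                "brown", "car"));
    }
}
```

Splitting the page into tags and text first means that a search word appearing inside an attribute value, such as a class name, is not wrapped and the page markup stays intact.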

You cannot change the highlighting definitions in the online configuration screen because HTML entry is disallowed for safety reasons. To change the highlighting HTML, you must therefore edit the findinsite-js settings file fisjs.properties in the work directory.

fisjs.properties stores various settings in the Java properties file format. The highlightStart and highlightEnd settings store the HTML that starts and ends highlighting. Characters that cannot be represented in the ISO 8859-1 character encoding must be 'escaped' as Unicode escapes of the form \uXXXX, where XXXX is the character's four-digit hexadecimal code.

The default values should be stored as follows:

highlightStart=<span style='background:yellow;color:red;font-weight:bold;'>
highlightEnd=</span>
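If you want to check an edited value before uploading it, the standard java.util.Properties class can parse the same format. The sketch below is only an illustration: the class and method names are invented, and it says nothing about how findinsite-js itself loads the file. It also includes a small helper that turns characters outside ISO 8859-1 into Unicode escapes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class HighlightSettingsSketch {
    /** Parses text in the Java properties file format. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /**
     * Replaces every character above 0xFF with a backslash, the letter u and
     * four hexadecimal digits. Other characters are copied as they are.
     */
    static String escapeNonLatin1(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c > 0xFF) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text =
                "highlightStart=<span style='background:yellow;color:red;font-weight:bold;'>\n"
              + "highlightEnd=</span>\n";
        Properties p = parse(text);
        System.out.println(p.getProperty("highlightStart"));
        System.out.println(p.getProperty("highlightEnd"));
    }
}
```

The key ends at the first '=' on the line, so the colons and spaces inside the style attribute are read as part of the value.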

Edit fisjs.properties directly and carefully, eg use FTP software to download a copy of the file, edit it in Notepad or similar, then upload it using FTP. findinsite-js only reads its settings file when it starts, so either wait for a restart, or stop and start the application, eg in the Tomcat Manager.

All site Copyright © 1996-2011 PHD Computer Consultants Ltd, PHDCC

Last modified: 27 September 2005.
