|See also||character encodings, Code 0, Codes 1.x|
This paper describes the file format of a help index file, as used by the PHD Hi HelpIndex applet.
Four format variants are defined. Code 0 and Codes 1.x are obsolete, but still recognised. Please use code 2.1, defined here, with Hi HelpIndex 2.1 or later.
Code 2.1 Revision 1, 6 October 1997, includes Page Tip (type 6) records and is used by Hi HelpIndex versions 2.1.10 and 2.2.10 or later.
Code 2.1 Revision 2, 2 October 2002, includes Include file (type 8) records and is used by Hi HelpIndex versions 2.1.19 or later. Include file (type 8) records are NOT currently supported by Hi Lab.
This documentation is Copyright © 1996, 1997, 2002 PHD Computer Consultants Ltd.
Format Code 2.1 Coding 1, Revision 2
An index file is a list of records that define URLs and the index entries that refer to them.
To make the file smaller, Index Item records can use short-cuts that refer to the URL records defining individual URLs.
BaseURL records can also be used to make an index file shorter.
An index file has a series of character strings, one string per line. Lines usually end in CR and/or LF.
Each line is a record, split into fields by a semi-colon delimiter. The first field has an integer indicating the record type.
The first line must be in 8 bit characters terminated by CR or LF. It always starts with the two characters "HI". In general, subsequent lines need not be in 8 bit characters.
An index file consists of one Header record line (type 0), then a Description record line (type 1), followed by zero or more of these record lines: Base URL (type 3), URL (type 5), Page Tip (type 6), Index Item (type 7) and Include file (type 8). Blank lines and lines with errors are ignored. Error messages are sent to the Java error output or console.
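The line and field layout described above can be sketched in a few lines of Python. This is only an illustration, not the Hi HelpIndex implementation: the function name `parse_index` and the choice to return `(record_type, fields)` tuples are assumptions made for the example.

```python
import sys

def parse_index(text):
    """Split index-file text into (record_type, fields) tuples.

    Hypothetical helper: the name and return shape are assumptions.
    """
    records = []
    for number, line in enumerate(text.splitlines()):
        if not line.strip():
            continue  # blank lines are ignored
        if number == 0:
            # The first line always starts with the two characters "HI".
            if not line.startswith("HI"):
                raise ValueError("not a help index file")
            line = line[2:]
        fields = line.split(";")
        try:
            record_type = int(fields[0])
        except ValueError:
            # Lines with errors are ignored; report them on the error output.
            print("ignored line %d: %r" % (number + 1, line), file=sys.stderr)
            continue
        records.append((record_type, fields[1:]))
    return records

sample = (
    "HI0;2;1;1;\n"
    "1;My Site Index;13 March 1997;\n"
    "\n"
    "bogus line\n"
    "5;1;1;hello.html;Hello world;;0;root.gif;\n"
)
records = parse_index(sample)
```

Any fields after the last one defined for a record type are simply carried along in `fields`, which is how comment fields survive parsing in this sketch.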
The Header record has Format code and Format sub-code fields. If a format is changed by adding more fields or record types - without changing the definition of the older record types - then the sub-code is incremented. For major changes, the Format code is incremented. Older versions of Hi HelpIndex can safely use newer index files as long as the Format code is the same. New versions of Hi HelpIndex so far have been able to read all the old formats.
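The compatibility rule above could be checked by a reader along the following lines. This is a hedged sketch: the constants, the function name `check_header` and the returned status strings are all hypothetical, and a real reader might also accept the older format codes.

```python
# Assumption: this hypothetical reader implements Format code 2, sub-code 1.
SUPPORTED_FORMAT_CODE = 2
SUPPORTED_SUB_CODE = 1

def check_header(header_line):
    """Classify a Header record line against the codes this reader supports."""
    if not header_line.startswith("HI"):
        raise ValueError("header must start with 'HI'")
    fields = header_line[2:].split(";")
    record_type, format_code, sub_code = (int(f) for f in fields[:3])
    if record_type != 0:
        raise ValueError("first record must be a Header record (type 0)")
    if format_code != SUPPORTED_FORMAT_CODE:
        return "unsupported format code"
    if sub_code > SUPPORTED_SUB_CODE:
        # Same Format code, newer sub-code: safe to read, but newer
        # fields or record types may be unknown to this reader.
        return "readable (newer sub-code)"
    return "ok"

status = check_header("HI0;2;1;1;")
```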
The Header record includes a Coding Number field. Currently only one coding is defined (1) which indicates that subsequent lines are made of 8 bit characters terminated by CR or LF.
If there are no Index Item records then there are no keywords. If there are no valid Parent fields in the URL records then there is no Contents.
Comments may be added as extra fields in each record.
Character Encoding
Version 2.2+ of Hi HelpIndex supports different character encodings for index files. See the usage instructions for details of how to specify the character encoding.
In a different character encoding, use the same index file layout defined above (ie semi-colon separated strings in CR or LF terminated lines) with the characters in your chosen character encoding.
This means that you can use Unicode or Unicode's UCS Transformation Format (UTF-8) (RFC 2044). See the character encodings page for the full list of supported encodings.
Note carefully that Hi Lab does not create index files in different character encodings, so you must make your own.
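As a rough illustration of reading such a file, the sketch below decodes the first line as 8-bit characters and the remaining lines in a chosen encoding. The function name `decode_index` is hypothetical, and the approach assumes an encoding such as UTF-8 in which CR and LF are single bytes; other encodings may need different handling.

```python
def decode_index(data, encoding="utf-8"):
    """Decode raw index-file bytes into text.

    Hypothetical helper. The Header line is treated as 8-bit characters;
    everything after it is decoded with `encoding`.
    """
    # Find the end of the first line (the first CR or LF byte).
    end = len(data)
    for position, byte in enumerate(data):
        if byte in (0x0D, 0x0A):
            end = position
            break
    header = data[:end].decode("latin-1")
    rest = data[end:].decode(encoding)
    return header + rest

raw = b"HI0;2;1;1;\r\n" + "1;\u00cdndice;2002;\r\n".encode("utf-8")
lines = decode_index(raw).splitlines()
```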
Format Code 2.1 Coding 1 Record Definitions
Annotations in Value column
|> 0||Integer value must be greater than zero|
|+||Field may be blank|
|*||Field may contain a short-cut to the URL Link field in a URL record.|
|**||Field may contain a short-cut to the Title field in a URL record.|
|0: Header||
Defines the index file format and coding.
This format has code 2 and sub-code 1. The coding is 1.
|1: Description||
Defines the index description and creation date.
|3: Base URL||
Defines a base URL, ie the URL prefix for URL records.
|5: URL||
Defines a URL and its position in the Contents tree hierarchy.
|6: Page Tip||
Defines a page tip for a URL or the index file.
|7: Index Item||
Defines an Index entry.
|8: Include file||
Specifies an index file to include.
This record type is not currently supported by Hi Lab.
Base URL records
The Base URL record addresses two issues.
First it cuts down the size of URL records, by avoiding repetition of a common URL prefix.
Second, it allows an index to be portable, so that the index always - and only - refers to the pages that you want it to refer to. Alternatively, you can leave out this record (or records), so that the index is relative to its current directory.
Note that if a URL record specifies an absolute URL (eg http://www.you.com/) then the Base URL is not used as a prefix even if a Base URL number is given.
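The prefixing rule can be sketched as follows. The function name `resolve_url` and the dictionary of Base URL prefixes are assumptions for illustration, and treating any link with a URL scheme as absolute is a simplification rather than the applet's documented test.

```python
from urllib.parse import urlparse

def resolve_url(link, base_urls, base_number=None):
    """Return the full URL for a URL record's link field.

    Hypothetical helper: `base_urls` maps Base URL numbers to prefixes.
    """
    if urlparse(link).scheme:
        return link  # absolute URL: the Base URL is not used
    # A missing or unknown Base URL number leaves the link relative.
    return base_urls.get(base_number, "") + link

base_urls = {1: "http://www.me.com/", 2: "http://www.you.com/"}
full = resolve_url("hello.html", base_urls, 1)
```

With no Base URL number the link is returned unchanged, which corresponds to an index that is relative to its current directory.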
Short-cuts
Short-cuts are used to reduce the size of index files.
Index Item records may contain short-cuts in their Index, URL link and URL title fields. A short-cut links to a URL record. The short-cut consists of an ampersand character & followed by the URL record URL number.
The Index and URL title fields may contain or start with a short-cut. The short-cut is replaced with the corresponding URL record Title field.
The URL link field may contain or start with a short-cut. The short-cut is replaced with the URL record URL link field. For example, this allows you to use anchor names within pages easily (by appending the #anchorname directly to a short-cut, eg "&2#top").
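Short-cut replacement amounts to a simple text substitution, sketched below. The function name `expand`, the dictionary layout of `url_records` and the decision to leave unknown short-cuts unchanged are assumptions for this example, not documented applet behaviour.

```python
import re

# An ampersand followed by a URL number, eg "&2".
SHORTCUT = re.compile(r"&(\d+)")

def expand(field, url_records, key):
    """Replace each short-cut in `field` with the `key` field
    ("link" or "title") of the URL record it refers to."""
    def replace(match):
        record = url_records.get(int(match.group(1)))
        # Assumption: a short-cut to an unknown URL number is left as-is.
        return record[key] if record else match.group(0)
    return SHORTCUT.sub(replace, field)

url_records = {
    1: {"link": "hello.html", "title": "Hello world"},
    2: {"link": "us.html", "title": "More about us"},
}
link = expand("&2#top", url_records, "link")
title = expand("&1 (About us)", url_records, "title")
```

Index and URL title fields would be expanded with `key="title"`, and URL link fields with `key="link"`.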
Contents tree
URLs defined in URL records are shown in the Hi HelpIndex Contents tree if the Parent field is set appropriately.
If the Parent field is empty or -1 then the URL does not appear in the tree.
One or more URLs must have a Parent of zero indicating that they are at root level. Otherwise Parent must contain the URL number of its parent. The parent must be defined beforehand.
For each URL in the tree, you may specify an Icon URL which must be a GIF image, usually about 15x15 pixels. Note that this is relative to the applet's document base (ie no Base URL is prefixed).
If the URL link field is empty then the record is used to define a Contents entry which does not relate to an actual URL.
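The Parent rules can be illustrated by building the tree in a single pass over the URL records in file order. This is a sketch under assumptions: the names `build_contents` and `render` are hypothetical, each record is reduced to a `(url_number, title, parent_field)` tuple, and records whose parent has not been defined beforehand are simply skipped.

```python
def build_contents(url_records):
    """Return (children, titles) for the Contents tree.

    `url_records` is a list of (url_number, title, parent_field) tuples
    in file order; node 0 stands for the root level.
    """
    children = {0: []}
    titles = {}
    for number, title, parent_field in url_records:
        try:
            parent = int(parent_field)
        except ValueError:
            continue  # empty or invalid Parent: not shown in the tree
        if parent not in children:
            continue  # -1, or a parent that was not defined beforehand
        children[parent].append(number)
        children[number] = []
        titles[number] = title
    return children, titles

def render(children, titles, node=0, depth=0):
    """Return the tree as a list of indented titles."""
    lines = []
    for child in children[node]:
        lines.append("  " * depth + titles[child])
        lines.extend(render(children, titles, child, depth + 1))
    return lines

children, titles = build_contents([
    (1, "Hello world", "0"),
    (2, "More about us", "1"),
    (3, "Hidden page", ""),
    (4, "Orphan", "9"),
])
outline = render(children, titles)
```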
Examples
HI0;2;1;1; header record
1;My Site Index;13 March 1997; description record
6;0;This is what we've got...\n\nEnjoy index page tip
3;1;http://www.me.com/;; base url 1
3;2;http://www.you.com/;Main; base url 2 with a Main Target
5;1;1;hello.html;Hello world;;0;root.gif; define a url as tree root with own icon
6;1;Page tip for the above URL
7;&1;&1;&1;_top; reference it once
7;About us;&1#us;&1;; reference it again with an anchor name
7;PHD;&1#us;&1 (About us);; reference same anchor with a different index
5;2;1;us.html;More about us;Main;1;; another URL, child of first URL with Main Target
7;About us (more);&2;&2;;
7;Sales;sales.html#top;Sales information;; index without short-cut
5;3;1;products.html;Our products;;1;; another URL, child of first URL
5;4;;;No linker;;1;; URL with no link
5;5;2;test.htm;Test page;;1;; URL off second Base URL
In the following example, URL records are used both to show an equivalent Contents tab card for email and to provide short-cuts for the Index Item records.
Trailing fields have been omitted where they are empty.
HI0;2;1;1;
1;Database example;Dec 17 1996;
5;1;;;PHD;;0
5;2;;mailto:email@example.com;Chris Cant;;1
7;&2;;Director
7;&2;;PHD Computer Consultants Ltd
7;&2;&2;Email: firstname.lastname@example.org
7;&2;http://www.phdcc.com/;Web: www.phdcc.com;_blank
5;3;;;John Cant;;1
7;&3;;Director
7;&3;;PHD Computer Consultants Ltd
7;&3;;France
7;&3;http://www.phdcc.com/;Web: www.phdcc.com
Version history
|Code 2.1, Coding 1, Revision 2||2 October 2002||Include file records added|
|Code 2.1, Coding 1, Revision 1||6 October 1997||Page tip records added|
|Code 2.1, Coding 1||26 February 1997||Complete re-arrangement|
|Code 1.2a||17 December 1996||Index Item URL link field may be left out|
|Code 1.2||26 November 1996||Added optional Parent and Icon URL fields to record type 1 (URL). Added optional Target field to record type 2 (Index Item)|
|Code 1.1||15 October 1996||Added optional record type 3 (Base URL)|
|Code 1.0||14 October 1996||Altered the Header record from format code 0|