| Pages Definition | holds information about the pages you want checked.
If you want to check the same page with different search engines or different search words, you will need several entries
in this table. Before filling in records in this table, run a search with the intended search
engine from your browser as normal and look at how the text is formatted in the generated url. The best option is to cut
and paste the relevant parts into the Page and Engine records. |
| urltxt | the string containing the url you want to search for |
| searchtext | the text as entered in the url sent to the search engine. Spaces
are not allowed in urls, so any you enter here will be replaced by + or %20. To search for an exact word
combination using quotes, write each quote as %22 |
| engineid | the record ID for the engine defined in the Engine table |
| maxpages | the number of result pages to search through before Brightdayler gives up |
| lastchecked | filled in by Brightdayler as it monitors the search engines |
| days between checks | how long to wait before re-checking, in days. Fractions
of a day are allowed, e.g. 0.25 checks every six hours |
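
The searchtext encoding and the fractional check interval described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not Brightdayler's own code: the function names `encode_searchtext` and `is_due` are hypothetical, and the sketch assumes the engine accepts `+` for spaces.

```python
from datetime import datetime, timedelta
from urllib.parse import quote_plus


def encode_searchtext(text: str) -> str:
    """Encode free text as it would appear inside a search url.

    quote_plus replaces spaces with '+' and double quotes with '%22'.
    (Hypothetical helper, not part of Brightdayler.)
    """
    return quote_plus(text)


def is_due(lastchecked: datetime, days_between_checks: float, now: datetime) -> bool:
    """Return True once the re-check interval has elapsed.

    days_between_checks may be fractional: 0.25 means six hours.
    (Hypothetical helper, not part of Brightdayler.)
    """
    return now - lastchecked >= timedelta(days=days_between_checks)


print(encode_searchtext('"bright day" ranking'))  # %22bright+day%22+ranking
```

Pasting the text straight from the browser's address bar, as recommended above, remains the safer route, because some engines expect `%20` rather than `+`.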
| Search Engine Definition | holds information for extracting data from whatever
search engine you want to use. These details are not hard coded into the application, which gives flexibility as search
engines evolve: |
| url part 1 | up to the beginning of the search words |
| url part 2 | from the end of the search words to the beginning of the page
number |
| url part 3 | from the end of the page number |
| recStart | text to uniquely identify the start of each search result
(view the source of the results page with a text editor) |
| startCache | text to uniquely identify the start of the url holding the
cached results for the page |
| endCache | text to uniquely identify the end of the url holding the cached
results for the page |
| startDate | text to uniquely identify the start of the date text on the cached
results page |
| endDate | text to uniquely identify the end of the date text on the cached
results page |
| recordsPPage | the number of records to allow per page when checking through
search results |
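
To show how these fields could fit together, here is a minimal Python sketch, assuming the url parts are simply concatenated around the search words and page number, and that the marker fields are plain substrings of the results page source. The function names, the `search.example` address and the sample markers are all hypothetical; the application's actual logic may differ.

```python
from typing import List, Optional


def build_url(part1: str, searchtext: str, part2: str, page: int, part3: str) -> str:
    """Join the url parts around the search words and the page number."""
    return f"{part1}{searchtext}{part2}{page}{part3}"


def between(text: str, start: str, end: str) -> Optional[str]:
    """Return the text between the first `start` marker and the next `end` marker.

    Returns None when either marker is missing.
    """
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)
    j = text.find(end, i)
    if j == -1:
        return None
    return text[i:j]


def split_records(html: str, rec_start: str) -> List[str]:
    """Split a results page into records, one per recStart marker."""
    # Everything before the first marker is page header, so drop it.
    return html.split(rec_start)[1:]


def find_position(html: str, rec_start: str, urltxt: str) -> Optional[int]:
    """Return the 1-based position of the first record containing urltxt."""
    for position, record in enumerate(split_records(html, rec_start), start=1):
        if urltxt in record:
            return position
    return None


url = build_url("http://search.example/find?q=", "%22bright+day%22",
                "&page=", 2, "&lang=en")
print(url)  # http://search.example/find?q=%22bright+day%22&page=2&lang=en
```

With this approach, `between(record, startCache, endCache)` would yield the cached-page url for a record, and the same helper applied to the cached page with startDate and endDate would yield the date text. Plain substring markers are fragile if the engine changes its markup, which is presumably why they are kept as editable fields.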