This page documents the URL mapping rules file used by the Webserver Search Engine. Be sure to read
iSeries Webserver Search Engine - Getting Started.
A URL Mapping Rules File is needed for two reasons.
- It tells the search engine which port number to use for the particular server.
- It also tells the search engine how to map its resulting physical document names in the file system to logical URLs that will be accepted by the server.
How URL mapping works
Once an index has been successfully built for a set of documents, it can be used to perform a search. The goal, of course, is to display the search results on a browser page and let the users click on the document or documents that satisfy their search request. The URL associated with each document must be formatted in such a way that the HTTP server will recognize it and allow the document to be served. It is the URL Mapping Rules File that provides the Webserver Search Engine with the information it needs to form that URL appropriately.
When a search is done with no mapping rules file specified, the Webserver Search Engine looks for a file named
index_directory/index_name.MAP_FILE. If found, it is used as the URL mapping rules file. If not found, no mapping occurs and the document's physical location is returned to the browser.
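The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the search engine's own code: the function name resolve_map_file and its parameters are hypothetical, and it assumes that an explicitly specified mapping rules file always takes precedence over the default.

```python
from pathlib import Path
from typing import Optional


def resolve_map_file(index_directory: str, index_name: str,
                     explicit_map_file: Optional[str] = None) -> Optional[Path]:
    """Return the mapping rules file to use, or None when no mapping applies.

    Hypothetical helper: the names and the precedence of an explicit file
    are assumptions made for this sketch.
    """
    if explicit_map_file is not None:
        # A mapping rules file was specified for the search, so use it.
        return Path(explicit_map_file)
    # Otherwise fall back to index_directory/index_name.MAP_FILE.
    default = Path(index_directory) / f"{index_name}.MAP_FILE"
    # If the default file is missing, no mapping occurs and the physical
    # document location is returned to the browser unchanged.
    return default if default.is_file() else None
```

A caller that receives None would return the physical document names without any URL mapping.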
In order to understand the Mapping Rules File, it is necessary to understand a little bit about how the IBM HTTP Server works. The IBM HTTP Server is able to have multiple HTTP servers active at once. Each server has an associated configuration and listens on a specified TCP/IP port. For example, the ADMIN server has a configuration that listens on port 2001. You may create your own server instance and associated configuration that listens on port 80 instead or on some other port such as 8080.
A web browser user addresses a particular HTTP server by its port number rather than by its name. For instance, to send a request to the ADMIN server you would use a URL that begins like this:
http://myserver.com:2001/. To send a request to your own server that is listening on port 8080, you would use a URL that begins like this:
http://myserver.com:8080/. Since there are potentially many servers running at the same time, you must tell the Webserver Search Engine which server to use to serve the documents matching a search request. You do this by specifying, in the URL Mapping Rules File, the
URL prefix to use, such as
http://myserver.com:8080/.
How to Create a URL Mapping Rules File
A URL Mapping Rules File can be created from the
IBM Web Administration GUI (Advanced tab) when the search index is created or by using the
CFGHTTPSCH CL command with OPTION(*CRTMAPF).
In addition to specifying the port number to listen on, each webserver configuration file must also specify which documents on the system will be allowed to be served by this instance by mapping external names to internal directories.
This is done on an Apache configuration with
Alias and
AliasMatch directives and on an Original server with
Pass rules. You supply the prefix for the final URL you want to appear with the documents found on the search. Configurations will require other directives and rules for authentication.
Apache:
Alias /clothing/ /html/products/consumer/clothing/
Original:
Pass /clothing/* /html/products/consumer/clothing/*
Both examples indicate that a URL such as
http://myserver.com/clothing/jacket.html will be accepted as valid since it matches the URL template
/clothing/. Furthermore, they indicate that the actual file being requested is physically located in the file system at
/html/products/consumer/clothing/jacket.html.
Notice that if the browser user had tried to access the physical file directly using the URL
http://myserver.com/html/products/consumer/clothing/jacket.html, it would have been rejected because it did not match the URL template
/clothing/. This is another reason why the URL Mapping Rules File is often necessary. It tells the Webserver Search Engine how to transform a physical document name into a URL that will match an acceptable template in one of the rules specified in the desired webserver configuration.
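To make the direction of this mapping concrete, here is a minimal Python sketch of how a server-side rule turns a URL path into a physical file name, or rejects it. It is a simplification for illustration only: the ALIASES list and the url_path_to_file function are hypothetical, only simple prefix rules are modeled, and a real configuration also involves other directives, such as those for authentication.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory form of the rule shown above:
# (URL prefix accepted by the server, directory it maps to).
ALIASES: List[Tuple[str, str]] = [
    ("/clothing/", "/html/products/consumer/clothing/"),
]


def url_path_to_file(url_path: str,
                     aliases: List[Tuple[str, str]] = ALIASES) -> Optional[str]:
    """Map a requested URL path to a physical file, or None if rejected."""
    for url_prefix, directory in aliases:
        if url_path.startswith(url_prefix):
            # Swap the URL prefix for the directory and keep the rest.
            return directory + url_path[len(url_prefix):]
    return None  # no rule matched, so the request is not served


print(url_path_to_file("/clothing/jacket.html"))
# prints /html/products/consumer/clothing/jacket.html
print(url_path_to_file("/html/products/consumer/clothing/jacket.html"))
# prints None
```

The URL Mapping Rules File lets the search engine apply the same rule in the opposite direction, from physical file name back to an acceptable URL.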
If you want to build and search an index on one of your own directories, you must make sure you have a webserver started that has the appropriate directive or rules to allow documents in that directory to be served, and you must build a URL Mapping Rules File associated with that configuration. This URL Mapping Rules File is named by default the same as your index with the extension MAP_FILE and is stored in the same directory as your index. If you do not use this default for the mapping rules file name, you must set the mapping rules file name in the sample search macro.
If the mapping rules file name is different from the default,
index_directory/index_name.MAP_FILE, this information must be supplied in the search macro you modify for your location.
That's all there is to it. The Webserver Search Engine now has all the information it needs to properly form URLs for all the documents in its index.
Advanced information on the URL Mapping Rules File
Use of a separate URL mapping rules file provides the iSeries Webserver Search Engine with the following capabilities:
- Multiple servers can be used to display search information.
- The webserver that runs the search can be different than the server that serves the resulting pages.
- Documents from different directories can be stored in a single index and served by separate webservers.
- Indexes do not have to be rebuilt if Original server request routing rules or Apache Alias directives are changed. Only the mapping rules file needs to be rebuilt.
The URL Mapping Rules File is simply a text file stored in the file system. It has the following format whether it is created using an Original or Apache server. On an Apache server, the Alias and AliasMatch directives are used to create PASS entries in the mapping rules file.
ROOT
[Prefix to use for URL address]
PASS
[URL template]
[Replacement file path]
PASS
[URL template]
[Replacement file path]
When you build a mapping rules file, you specify the name of a webserver. The required information is extracted from the configuration associated with the server. This configuration is scanned for any Alias or AliasMatch directives for an Apache server or Pass rules for an Original server. Appropriate PASS statements are subsequently copied to the mapping rules file. Also placed in the mapping rules file is the root prefix to be used for each generated URL. This is best described using an example.
Consider an
Original HTTP server with a configuration file named SAMPLE1 that contains the following Pass rules:
Pass /clothing/* /html/products/consumer/clothing/*
Pass /shoes/* /html/products/consumer/shoes/*
. . .
If you use an
Apache server, the equivalent entries would be Alias directives.
If you build a mapping rules file based on either an Original or Apache server and specify a Prefix to use for a URL address such as
http://myserver:8080, the resulting mapping rules file will look as follows:
ROOT
http://myserver:8080
PASS
/clothing/*
/html/products/consumer/clothing/*
PASS
/shoes/*
/html/products/consumer/shoes/*
Notice that each keyword and string is on a separate line for ease of processing. During a search operation that specifies this mapping rules file, processing takes place as follows: Let's say a document satisfying a search is physically located in
/html/products/consumer/shoes/wingtips.html. The first PASS statement is examined. The document name is compared to the second string in this directive:
/html/products/consumer/clothing/*. The document name does not match, so the next PASS statement is examined. The document name is compared to the second string in this directive:
/html/products/consumer/shoes/*. In this case the document name does match so the information before the * is replaced with the URL template, resulting in the following URL:
/shoes/wingtips.html. To complete the URL the string following the ROOT keyword is used as the prefix. The resulting URL is then
http://myserver:8080/shoes/wingtips.html which will be successfully served by the SAMPLE1 configuration.
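The matching steps just described can be sketched as a short Python function. This is a hedged illustration of the documented behavior, not the search engine's implementation: the name physical_to_url is hypothetical, the sketch assumes that each template ends in a single trailing * wildcard, and returning None when no PASS statement matches is an assumption, since the text above does not say what happens in that case.

```python
from typing import List, Optional, Tuple


def physical_to_url(document: str, root: str,
                    passes: List[Tuple[str, str]]) -> Optional[str]:
    """Turn a physical document name into a URL using PASS statements.

    passes holds (URL template, replacement file path) pairs in file order.
    """
    for url_template, file_template in passes:
        # Compare the document name to the second string of the PASS
        # statement, ignoring the trailing * wildcard.
        file_prefix = file_template.rstrip("*")
        if document.startswith(file_prefix):
            # Replace the part before the * with the URL template, then
            # add the string following ROOT as the prefix.
            url_prefix = url_template.rstrip("*")
            return root + url_prefix + document[len(file_prefix):]
    return None  # assumption: no PASS statement matched


passes = [
    ("/clothing/*", "/html/products/consumer/clothing/*"),
    ("/shoes/*", "/html/products/consumer/shoes/*"),
]
print(physical_to_url("/html/products/consumer/shoes/wingtips.html",
                      "http://myserver:8080", passes))
# prints http://myserver:8080/shoes/wingtips.html
```

Because the statements are tried in order, the first PASS statement whose file path matches determines the URL.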
If you do not specify a prefix to be used for URL addressing, the string following the ROOT keyword will be blank and nothing will be added as the prefix to the URL. This is called relative addressing. In this case, the web browser will itself add the root as the prefix of the URL based on the root of the current page.
Relative addressing will only work if the webserver instance to run the search is also the instance that will be used to serve the pages.
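The way a browser completes a relative URL can be illustrated with Python's standard urllib.parse.urljoin, which follows the usual URL resolution rules. The search page address below is a hypothetical example; only the host, port, and document path come from the examples on this page.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that displays the search results.
search_page = "http://myserver:8080/search/results.html"

# URL produced from a mapping rules file whose ROOT string is blank.
relative_url = "/shoes/wingtips.html"

# The browser resolves the relative URL against the current page, so the
# scheme, host, and port of the search page become the prefix.
resolved = urljoin(search_page, relative_url)
print(resolved)
# prints http://myserver:8080/shoes/wingtips.html
```

The resolved URL always points back at the server that delivered the search page, which is why relative addressing requires that the same instance also serve the documents.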
It is possible to have multiple ROOTs in one URL mapping rules file. This is necessary if some of the documents that satisfy a search need to be served by a different webserver. For example, some documents may be served without encryption, while others require the documents to be encrypted via SSL before they will be served. A URL mapping rules file that allows this might look like the following:
ROOT
http://myserver
PASS
/clothing/*
/html/products/consumer/clothing/*
PASS
/shoes/*
/html/products/consumer/shoes/*
ROOT
https://myserver:443
PASS
/prices/clothing/*
/html/products/consumer/prices/clothing/*
PASS
/prices/shoes/*
/html/products/consumer/prices/shoes/*
With this URL mapping rules file, a document named
/html/products/consumer/shoes/wingtips.html will be served by the port 80 webserver configuration as
http://myserver/shoes/wingtips.html. A document named
/html/products/consumer/prices/shoes/wingtips.html will be served by a webserver listening on the SSL port (port 443) as
https://myserver:443/prices/shoes/wingtips.html.
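As an illustration of how a file with several ROOT sections might be read, the following Python sketch parses the line-oriented format into sections and then maps a document name. It is a sketch under stated assumptions, not the product's parser: the names parse_map_file and map_document are hypothetical, templates are assumed to end in a single trailing *, a ROOT keyword followed directly by another keyword is treated as a blank prefix, and sections are assumed to be searched in file order.

```python
from typing import List, Optional, Tuple

# (root prefix, [(URL template, replacement file path), ...])
Section = Tuple[str, List[Tuple[str, str]]]

MAP_TEXT = """ROOT
http://myserver
PASS
/clothing/*
/html/products/consumer/clothing/*
PASS
/shoes/*
/html/products/consumer/shoes/*
ROOT
https://myserver:443
PASS
/prices/clothing/*
/html/products/consumer/prices/clothing/*
PASS
/prices/shoes/*
/html/products/consumer/prices/shoes/*
"""


def parse_map_file(text: str) -> List[Section]:
    """Group PASS statements under the ROOT that precedes them."""
    lines = [line.strip() for line in text.splitlines()]
    sections: List[Section] = []
    i = 0
    while i < len(lines):
        if lines[i] == "ROOT":
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            if nxt in ("ROOT", "PASS"):
                sections.append(("", []))   # blank prefix, keyword follows
                i += 1
            else:
                sections.append((nxt, []))  # prefix line (may be blank)
                i += 2
        elif lines[i] == "PASS" and sections and i + 2 < len(lines):
            sections[-1][1].append((lines[i + 1], lines[i + 2]))
            i += 3
        else:
            i += 1  # skip blank or unrecognized lines
    return sections


def map_document(document: str, sections: List[Section]) -> Optional[str]:
    """Return the URL from the first matching PASS statement, or None."""
    for root, passes in sections:
        for url_template, file_template in passes:
            file_prefix = file_template.rstrip("*")
            if document.startswith(file_prefix):
                return root + url_template.rstrip("*") + document[len(file_prefix):]
    return None


sections = parse_map_file(MAP_TEXT)
print(map_document("/html/products/consumer/prices/shoes/wingtips.html", sections))
# prints https://myserver:443/prices/shoes/wingtips.html
```

Each document picks up the prefix of the ROOT section whose PASS statement matched it, so documents in the prices directories receive the https prefix while the others receive the plain http prefix.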
This URL mapping rules file can be built by selecting the Build URL mapping rules file option multiple times. Each time, select the
Append to the existing mapping rules file option to append one webserver configuration's rules to another.