Dispatcher Concepts

This section describes the main features of the PortalProtect Dispatcher and gives examples of how to use them.

Load Balancing

You can use the Dispatcher for load balancing between multiple application servers running the same version of an application.

This is done simply by adding more servers to the “targetservers” or “alternateserver.xxxx.targets” configuration entry – see the dispatcher configuration section for details on how to do it.

The Dispatcher balances requests from new users across the defined servers. Once a user has been assigned a server, all further requests from that user go to the same server for as long as it is up. If it goes down, another server is selected, and all further requests go to that one.

Clustering

The dispatcher is easy to cluster: simply set up multiple dispatcher servers. They do not need to know about each other, but they do need to share the same application servers in their configuration.

In front of the dispatchers, set up a load balancer, preferably a hardware load balancer with built-in support for hardware SSL handshaking. Set the load balancer to use random or round-robin balancing. There is no need to configure session affinity (by IP address or similar), since the dispatchers make sure that the same user reaches the same application server, regardless of which dispatcher receives the request.

The dispatcher will do load balancing and failover towards the configured application servers. Each configured server has a logical name, which will be placed in a session cookie sent to the client – when a dispatcher receives a new request, it will forward it to the appropriate application server, if it is currently up. If it is down, another server will be selected for the user.

New users will be sent to a random server, selected from the list of application servers that are currently responding to requests.
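The selection logic described above can be sketched as follows. This is only an illustration of the idea, not the Dispatcher's actual implementation: the function name, the dictionary of server states and the way the cookie value is passed in are all assumptions made for the example.

```python
import random


def choose_server(cookie_value, servers_up, rng=random):
    """Return the logical name of the server that should handle a request.

    cookie_value: logical server name read from the session cookie,
                  or None for a new user (hypothetical representation).
    servers_up:   dict mapping logical server name -> True if it responds.
    """
    # Returning user whose assigned server is still up: keep the assignment.
    if cookie_value is not None and servers_up.get(cookie_value, False):
        return cookie_value
    # New user, or the assigned server is down: pick among responding servers.
    candidates = [name for name, up in servers_up.items() if up]
    if not candidates:
        # No server responds: fall back to any configured server.
        candidates = list(servers_up)
    return rng.choice(candidates)
```

Because the decision depends only on the cookie and on the list of servers that are up, two dispatchers that share the same server configuration reach the same answer for a returning user without exchanging any state.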

Reasons for clustering the Dispatcher include both failover and performance. If the Dispatcher has to handle SSL in software, the CPU overhead can be enormous: as a rough rule of thumb, you need about 5 times as many servers to run SSL as you need to run unencrypted. Note that the SSL overhead can be greatly reduced by using dedicated hardware to handle the SSL handshakes.

Ping URLs

When setting up load balancing between multiple servers, it is very important to have a correctly functioning ping URL defined – this URL is periodically requested by the dispatcher to verify that the application is still up. If it returns anything other than the HTTP response code “200 OK” then the server is assumed to be down, and it will no longer receive new requests.

If all servers are down (no matter if the ping URL returns something other than “200 OK” or if the dispatcher cannot connect to the server at all) then a random server will be selected for each new request in the hope that it will be able to respond.

 Usually, you will set the ping URL to point to a servlet that checks backend database connections and other resources, and only returns “200 OK” if it is able to serve requests successfully – this means that if you have multiple mirrored servers, and one loses the connection to its database, then the other servers will receive all the requests until the connection is up again.
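As an illustration of such a ping endpoint, here is a minimal sketch written as a Python WSGI application rather than a servlet. The check names and the choice of 503 for the failure case are assumptions made for the example; any response other than “200 OK” would mark the server as down.

```python
def make_ping_app(checks):
    """Build a WSGI app that answers 200 only if every check passes.

    checks: dict of name -> zero-argument callable returning True when
            the resource (database, cache, ...) is healthy (hypothetical).
    """
    def app(environ, start_response):
        failed = []
        for name, check in checks.items():
            try:
                ok = check()
            except Exception:
                ok = False  # a check that raises counts as a failure
            if not ok:
                failed.append(name)
        status = "200 OK" if not failed else "503 Service Unavailable"
        text = "OK" if not failed else "FAILED: " + ", ".join(failed)
        body = text.encode("utf-8")
        start_response(status, [("Content-Type", "text/plain"),
                                ("Content-Length", str(len(body)))])
        return [body]
    return app
```

Keeping the checks cheap matters, since the URL is requested periodically for every configured server.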

Dispatching to Multiple Applications

It is quite easy to dispatch requests to multiple different applications. The servers defined in “targetservers” can be considered the default application: if no other defined rule selects another server, the request goes to a server defined in “targetservers”.

 Basically, you define a number of “alternateserver” entries that point to one or more mirrors of other applications, and then you use the “serverrule” configuration to specify in which cases requests should go to one of the alternate servers instead of the default one.

The serverrule can match the URL, the scheme (http/https) or any of the HTTP headers in the request. For example, a request that arrives at ebusiness.portalprotect.com can be forwarded to alternate server 1, while a request that arrives at insurance.portalprotect.com is forwarded to alternate server 2.

Refer to the configuration quick guide section, as well as the dispatcher configuration for details.
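The host-based routing described above can be pictured with the following sketch. It does not use the real serverrule syntax (see the dispatcher configuration for that). The rule table, the identifiers “application1” and “application2” and the first-match-wins behaviour are assumptions made for the example.

```python
import re

# Hypothetical rule table: (request field, regular expression, target identifier).
RULES = [
    ("host", r"^ebusiness\.portalprotect\.com$", "application1"),
    ("host", r"^insurance\.portalprotect\.com$", "application2"),
]


def select_target(request, rules=RULES, default="default"):
    """Pick a target for a request given as a dict ('host', 'scheme', 'url')."""
    for field, pattern, target in rules:
        value = request.get(field, "")
        if re.search(pattern, value, re.IGNORECASE):
            return target  # first matching rule wins (an assumption)
    # No rule matched: use the servers from "targetservers".
    return default
```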

If you set up Consul (www.consul.io), PortalProtect can query it periodically and update the list of available application servers dynamically as services are added to or removed from Consul. This is useful for scaling up or down in cloud services.

Do this by using the macro ${consul:/consul_url} in the “targetservers” configuration – e.g. ${consul:/v1/catalog/service/appservers}
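To show what such a catalog query returns, the sketch below turns a Consul /v1/catalog/service/<name> response into a list of host:port entries. The Address, ServiceAddress and ServicePort fields are part of Consul's catalog response. How PortalProtect itself interprets the response is not documented here, so the function is only an assumption about the general idea.

```python
import json


def servers_from_consul(catalog_json):
    """Turn a /v1/catalog/service/<name> response into 'host:port' strings."""
    servers = []
    for entry in json.loads(catalog_json):
        # ServiceAddress may be empty, in which case the node Address applies.
        host = entry.get("ServiceAddress") or entry["Address"]
        servers.append(f"{host}:{entry['ServicePort']}")
    return servers
```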

Protecting URLs

There are a couple of different ways of protecting URLs – the Dispatcher will always call isURLAllowed() in the Agent, and its behaviour can be controlled 100% by the implementation in the Validator plugin within the Agent.

 As parameters to each call, an identifier and the URL will be passed – note that the URL does NOT contain the host part, just the first slash and the part after it, excluding any parameters.

The identifier passed contains either the word “default” or the name of the alternateserver entry that the URL will be passed to, so if you e.g. have defined an alternate server called “alternateserver.application1.targets=xxx” (see the Dispatcher configuration for details on how to do it) then the identifier will contain the value “application1”.

The Validator plugin can do what it wants to check if the current user has access to the URL, but usually it will pass the identifier to an authorization plugin (by calling Agent.getInstance().getProtectedURLs() ) which will ask the authorization plugin which URLs are protected for that specific identifier.
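The sketch below illustrates the two inputs described above: the URL reduced to its path, and a lookup of protected URLs by identifier. It is not the Agent's API. The function names, the prefix-matching rule and the structure of the protected-URL table are assumptions made for the example.

```python
from urllib.parse import urlsplit


def url_for_validation(full_url):
    """Return only the path: no scheme, host or query parameters."""
    path = urlsplit(full_url).path
    return path or "/"


def is_url_allowed(identifier, url, protected, user_is_authenticated):
    """Decide access for a path.

    protected: dict of identifier -> list of protected path prefixes
               (hypothetical stand-in for what an authorization plugin returns).
    """
    for prefix in protected.get(identifier, []):
        if url.startswith(prefix):
            # Protected URL: only authenticated users may pass.
            return user_is_authenticated
    return True  # not protected for this identifier
```

For example, a request for https://ebusiness.portalprotect.com/app/orders?id=7 would be checked as /app/orders, with the identifier “application1” if it is routed to that alternate server.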

If that kind of protection is not enough for your needs, you have the option of implementing a plugin to the dispatcher which will intercept all calls, and have the option to do custom checking based on all parts of the request. It basically acts like a filter that gets access to all data about to be sent to the application, before it happens.

URL Rewriting

By adding URL Rewrite rules, you can rewrite URLs before they are otherwise processed.

The rewriting directive syntax is similar to that of the Apache mod_rewrite module.

Directives are added to the dispatcher configuration by adding entries called urlrewrite.xxxx where xxxx is a number between 1 and 512 which decides the order the directives are processed in.

You can find more information and examples at http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html

And a manual at http://www.tuckey.org/urlrewrite/

Note that the manual above mentions configuring the URL rewriting rules using WEB-INF/web.xml and urlrewrite.xml – instead of this, the configuration is done using the PortalProtect configuration.

Supported directives are:

  • RewriteLogLevel
  • RewriteLog
  • RewriteEngine
  • RewriteCond
  • RewriteRule

Restrictions and limitations:

Attribute         Explanation
RewriteLogLevel   Specified as an integer, interpreted as: <= 1 – FATAL, 2 – ERROR, 3 – INFO, 4 – WARN, >= 5 – DEBUG
RewriteLog        SYSOUT, SYSERR, log4j, commons (if not set, log4j logging will be used)
RewriteRule       Supported, except for the following flags:
                    • proxy flag [P]
                    • chain flag [C]
                    • env flag [E]
                    • next flag [N]
                    • nosubreq flag [NS]
                    • qsappend flag [QSA]
                    • skip flag [S]
RewriteBase       Not supported
RewriteLock       Not supported
RewriteMap        Not supported
RewriteOptions    Not supported

Examples

This is an example of a few rules as specified in the configuration:

<property name="urlrewrite.1" value="RewriteCond %{HTTP_REFERER} (lean-) [NC,OR]" description=""/>
<property name="urlrewrite.2" value="RewriteCond %{HTTP_REFERER} (xxx) [NC]" description=""/>
<property name="urlrewrite.3" value="RewriteRule .* - [F,L]" description="If the Referer URL contains lean- or xxx, deny the request"/>
<property name="urlrewrite.4" value="RewriteRule ^/google/(.*)$ http://www.google.com/search?q=$1 [R]" description="Redirect links to /google/xxxxxx to google search with xxxxxx as parameter"/>

This is a slightly more readable version of the same rules:

RewriteCond %{HTTP_REFERER} (lean-) [NC,OR]
RewriteCond %{HTTP_REFERER} (xxx) [NC]
RewriteRule .* - [F,L]
RewriteRule ^/google/(.*)$ http://www.google.com/search?q=$1 [R]

In this example, there are 2 rules, where the first rule has 2 preconditions required for it to be executed.

If the HTTP Referer header contains the string “lean-” or the string “xxx” (both compared case-insensitively), the first rule is executed, and its [F] flag causes HTTP status 403 Forbidden to be sent back to the browser.

The 2nd rule has no preconditions. If the URI starts with /google/, the request is redirected to http://www.google.com/search?q= with the rest of the URI appended as the query parameter.
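To make the evaluation order concrete, the following sketch reproduces the behaviour of these four directives with Python regular expressions. It is not how the Dispatcher evaluates rules internally. The function name and the (action, value) return format are assumptions made for the example.

```python
import re


def apply_rules(uri, referer=""):
    """Return (action, value) for a request, mirroring the four directives."""
    # RewriteCond %{HTTP_REFERER} (lean-) [NC,OR]
    # RewriteCond %{HTTP_REFERER} (xxx) [NC]
    referer_matches = (re.search(r"(lean-)", referer, re.IGNORECASE)
                       or re.search(r"(xxx)", referer, re.IGNORECASE))
    if referer_matches:
        # RewriteRule .* - [F,L]: ".*" matches any URI, so forbid and stop.
        return ("forbidden", 403)
    # RewriteRule ^/google/(.*)$ http://www.google.com/search?q=$1 [R]
    match = re.match(r"^/google/(.*)$", uri)
    if match:
        return ("redirect", "http://www.google.com/search?q=" + match.group(1))
    # No rule applied: the URI is processed unchanged.
    return ("pass", uri)
```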

HTTP Session Keepalive ping

The dispatcher has a feature called “HTTP Session keepalive ping” which allows timeouts to be shared between applications and PortalProtect.

Essentially, every session has its own timeout: PortalProtect's session has a timeout, and each application server that PortalProtect forwards a request to may have its own HTTP session with its own timeout.

This creates a potential problem with timeouts. For example, if a site has 3 different applications, the user might be active first on application 1, then 2 and then 3. While the user is actively working on application 3, all requests go through the PortalProtect dispatcher, so the session timeouts of both application 3 and PortalProtect are reset whenever the user performs an action.

However, applications 1 and 2 are unaware of this, and their HTTP sessions might time out even though the user is still active and working with application 3.

If this is a problem, PortalProtect has a feature to keep the session alive on the inactive application servers that the user has previously visited in the session.

Requirements

The feature requires that PortalProtect is configured to catch session cookies using the cookiesToHideFromBrowser property on the dispatcher. PortalProtect then stores cookie values such as JSESSIONID and ASP.NET_SessionId in the PortalProtect session, hides them from the browser, and adds them to each request sent to the server. The application server will not notice the difference, but the browser sees only a single session cookie, namely PortalProtect's own, and none of the individual HTTP session cookies from the backend application servers.

When using multiple groups of dispatchers, each separate group needs a unique value of “dispatchergroup” configured – this is required so the session controller which controls the frequency of the pings can detect which dispatchers to ask to ping which applications.

How it Works

Periodically (configured by “httpSessionKeepAlivePingInterval”), the session controller enumerates all sessions for identified users that it owns (in a cluster, each session controller “owns” a portion of the sessions).

The session controller will then query all running dispatchers for their dispatchergroup and generate a list of available dispatchers for each group.

The dispatcher has previously stored information in the session about which application server (in a cluster) was selected for a particular user, and which dispatchergroup that application server belonged to. The session controller sorts this information so that it ends up with a list of applications within a list of dispatchergroups that need to be “pinged” to keep the HTTP session alive.

It will then divide this into chunks of a few thousand users and divide the work of actually making the requests up between any available dispatchers within a dispatcher group.

When a dispatcher next receives a notification that it should ping the application servers on behalf of the identified sessions, it retrieves the previously stored cookies (if present) from the session and “pings” the application server previously assigned to this user by sending an HTTP GET request to it. This causes the application server to reset its idle time. The end result is that as long as a user is active on a single application, all applications' HTTP sessions are kept active, so the timeout is effectively shared across all of them.
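The grouping and chunking steps above can be sketched as follows. This is an illustration under stated assumptions, not the session controller's actual code: the tuple layout, the chunk size of 2000 and the round-robin assignment of chunks to dispatchers are all chosen for the example.

```python
from collections import defaultdict


def plan_pings(sessions, dispatchers_by_group, chunk_size=2000):
    """Assign chunks of sessions to dispatchers.

    sessions: iterable of (session_id, dispatchergroup, app_server) tuples.
    dispatchers_by_group: dict of dispatchergroup -> list of dispatcher names.
    Returns a list of (dispatcher, app_server, [session_id, ...]) work items.
    """
    # Group session ids by dispatcher group, then by application server.
    grouped = defaultdict(lambda: defaultdict(list))
    for session_id, group, app_server in sessions:
        grouped[group][app_server].append(session_id)

    work = []
    for group, apps in grouped.items():
        dispatchers = dispatchers_by_group.get(group, [])
        if not dispatchers:
            continue  # no running dispatcher in this group
        turn = 0
        for app_server, ids in apps.items():
            # Split each application's sessions into chunks and hand the
            # chunks out to the group's dispatchers in turn.
            for start in range(0, len(ids), chunk_size):
                dispatcher = dispatchers[turn % len(dispatchers)]
                work.append((dispatcher, app_server, ids[start:start + chunk_size]))
                turn += 1
    return work
```

Each work item corresponds to one notification sent to a dispatcher, which then issues the HTTP GET requests for the sessions in its chunk.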

How to Enable it

  1. Configure “cookiesToHideFromBrowser” with the names of the session cookies you wish PortalProtect to catch.
  2. Make sure that “dispatchergroup” property is set to a value unique for each group of dispatchers if you have more than one group.
  3. Set “saveSelectedServerInSession” property to true on dispatchers, or the information needed by the session controller will not be available.
  4. Configure “httpSessionKeepAlivePingInterval” for the session controller and set it to the number of seconds between each “ping”. You should use a value lower than the http session timeout, but keep it relatively high to keep traffic low. E.g. if the http session timeout is 20 minutes, set this to 900 seconds = 15 minutes.
  5. If you do not wish to use the URL configured in “pingurl” but require a separate URL for this, you need to configure it in the “httpsessionkeepalivepingurl” property.