Channel: Robin Moffatt - Rittman Mead

TimesTen and OBIEE port conflicts on Exalytics

Introduction

Whilst helping a customer set up their Exalytics server the other day I hit an interesting problem. Not interesting as in hey-this-will-cure-cancer, or oooh-big-data-buzzword interesting, or even interesting as in someone-has-had-a-baby, but nonetheless, interesting if you like learning about the internals of the Exalytics stack.

The installation we were doing was multiple non-production environments on bare metal, a design that we developed on our own Rittman Mead Exalytics box early on last year, and one that Oracle describe in their recently published Exalytics white paper. Part of a multiple environment install is careful planning of the port allocations. OBIEE binds to many ports per instance, and there is also TimesTen to consider. I’d been through this meticulously, specifying ports through the staticports.ini file when building the OBIEE domain, as well as in the bim-setup.properties for TimesTen.

So, having given such careful thought to ports, imagine my surprise at seeing this error when we started up one of the OBIEE instances:

[OracleBIServerComponent] [ERROR:1] [] [] [ecid: ] [tid: e1dbd700]  [nQSError: 12002] 
Socket communication error at call=bind: (Number=98) Address already in use

which caused a corresponding OPMN error:

ias-component/process-type/process-set:
  coreapplication_obis1/OracleBIServerComponent/coreapplication_obis1/

Error
--> Process (index=1,uid=328348864,pid=17875)
  time out while waiting for a managed process to start

Address already in use? But…all my ports were hand-picked so that they explicitly wouldn’t clash …

Random ports

So it turns out that as well as using the two ports that are explicitly configured (daemon and server, usually 53396 and 53397 respectively), the TimesTen server process also binds to a port chosen at random each time it starts, for the purpose of internal communication. This is similar to what the Oracle Database listener does, and as my colleague Pete Scott pointed out, it’s been known for port clashes to occur between ODI and the Oracle Database listener.

To see this in action, use the netstat command, with the flags tlnp:

  • t : tcp only
  • l : LISTEN status only
  • n : numeric addresses/ports only
  • p : show associated processes

We pipe the output of the netstat command to grep to filter for just the process we’re looking for, giving us:

[oracle@obieesample info]$ netstat -tlnp|grep ttcserver
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:58476             0.0.0.0:*                   LISTEN      4811/ttcserver
tcp        0      0 0.0.0.0:53397               0.0.0.0:*                   LISTEN      4811/ttcserver

Here we can see on the last line the TimesTen server process (ttcserver) listening on the explicitly configured port 53397 for traffic on any address. We also see it listening on port 58476 for traffic only on the local loopback address 127.0.0.1.

What happens if we restart the TimesTen server and look at the ports again?

[oracle@obieesample info]$ ttdaemonadmin -restartserver
TimesTen Server stopped.
TimesTen Server started.

[oracle@obieesample info]$ netstat -tlnp|grep ttcserver
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:17073             0.0.0.0:*                   LISTEN      6878/ttcserver
tcp        0      0 0.0.0.0:53397               0.0.0.0:*                   LISTEN      6878/ttcserver

We can see the same listen port 53397 as before, accepting client connections either locally or remotely, but the port bound to the local loopback address 127.0.0.1 has now changed to 17073.
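
The same netstat output also answers the reverse question raised by the original bind error: which process is already listening on a given port? A minimal sketch, assuming the column layout shown above (local address in the fourth column, PID/program in the seventh); port_owner is a hypothetical helper name, not part of any tool:

```shell
# port_owner PORT
# Reads `netstat -tlnp`-style lines on stdin and prints
# "local-address pid/program" for each LISTEN socket on PORT.
port_owner() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # Field 4 is the local address; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) print $4, $7
        }'
}
```

For example, netstat -tlnp | port_owner 53397 on the machine above would report the ttcserver listener. As the netstat warning says, it needs to run as root to identify processes owned by other users.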

Russian Roulette

So TimesTen randomly grabs a port each time it starts, and this may or may not be one of the ones that OBIEE is configured to use. If OBIEE is started first, then the problem does not arise because OBIEE has already taken the ports it needs, leaving TimesTen to randomly choose from the remaining unused ports.

If there are multiple instances of TimesTen and multiple instances of OBIEE then the chances of a port collision increase. What I wanted to know was how to isolate TimesTen from the ports I’d chosen for OBIEE. Constraining the application startup order (so that OBIEE gets all its ports first, and then TimesTen can use whatever is left) is a lame solution since it artificially couples two components that don’t need to be, adding to the complexity and support overhead.
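
To put a rough number on that: if K of the ports configured for OBIEE fall inside a range of N candidate ports, and M random ports are drawn in total, the chance of at least one clash is about 1 - (1 - K/N)^M. This assumes each port is picked uniformly and independently, which is a simplification and not something the TimesTen or kernel documentation promises. A small sketch (collision_chance is a hypothetical helper):

```shell
# collision_chance K N M
#   K = configured OBIEE ports that lie inside the random range
#   N = number of ports in the random range
#   M = number of random ports drawn (restarts x TimesTen instances)
# Assumption: uniform, independent draws. Prints the probability to 4 places.
collision_chance() {
    LC_ALL=C awk -v k="$1" -v n="$2" -v m="$3" \
        'BEGIN { printf "%.4f\n", 1 - (1 - k / n) ^ m }'
}
```

For instance, collision_chance 10 100 2 prints 0.1900: ten configured ports in a range of 100 and two random draws give 1 - 0.9^2 = 0.19. Real ranges are far larger, so the chance per restart is small, but it compounds with every restart and every extra instance.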

TimesTen itself offers no way to configure which ports it picks here. Doc ID 1295539.1 states:

[...] All other TCP/IP port assignments to TimesTen processes are completely random and based on the availability of ports at the time when the process is spawned.

To understand the port range that TimesTen was using (so that I could then configure OBIEE to avoid it) I knocked up this little script. It restarts the TimesTen server, and then appends to a file the random port that it has bound to:

$ cat ./tt_port_scrape.sh
#!/bin/bash
# Restart the TimesTen server, then append its loopback port to tt_ports.txt
ttdaemonadmin -restartserver
netstat -tlnp|grep ttcserver|grep 127.0.0.1|awk -F ":" '{print $2}'|cut -d " " -f 1 >>tt_ports.txt

Using my new favourite Linux command, watch, I can run the above repeatedly (by default, every two seconds) until I get bored^H^H^H^H^H^H^H^H^H have collected sufficient data points:

watch ./tt_port_scrape.sh

Finally, parse the output from the script to look at the port ranges:

echo "Lowest port: " $(sort -n tt_ports.txt | head -n1)
echo "Highest port: " $(sort -n tt_ports.txt | tail -n1)
echo "Number of tests: " $(wc -l < tt_ports.txt)

Using this, I observed that TimesTen server would bind to ports ranging from as low as around 9000, up to 65000 or so.
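
The three commands above sort the file twice; the same summary can be produced in one pass. A sketch using awk, assuming tt_ports.txt holds one port number per line (summarise_ports is a hypothetical helper, and any line that is not a bare number is skipped):

```shell
# summarise_ports FILE
# Prints the lowest port, highest port and number of samples in FILE.
summarise_ports() {
    awk '
        /^[0-9]+$/ {
            p = $1 + 0                      # force numeric comparison
            if (count == 0 || p < min) min = p
            if (count == 0 || p > max) max = p
            count++
        }
        END {
            if (count == 0) { print "No ports recorded"; exit }
            print "Lowest port: " min
            print "Highest port: " max
            print "Number of tests: " count
        }' "$1"
}
```

Run it as summarise_ports tt_ports.txt once watch has collected enough samples.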

Solution

Raising this issue with the good folks at Oracle Support yielded a nice easy solution. In the kernel settings, there is a configuration option net.ipv4.ip_local_port_range which specifies the range from which the kernel picks a port whenever an application asks for one without naming a specific number. On this server it was set to 9000 to 65500, which matches the range that I observed in my testing above:

[root@rnm-exa-01 ~]# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 9000     65500

I changed this range with sysctl -w:

[root@rnm-exa-01 ~]# sysctl -w net.ipv4.ip_local_port_range="56000 65000"
[root@rnm-exa-01 ~]# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 56000    65000

and then reran my testing above, which sure enough showed that TimesTen was now keeping its hands to itself and away from my configured OBIEE port ranges:

Lowest port:  56002
Highest port:  64990

(If I ran the test for longer, I’m sure I’d hit the literal extremes of the range.)

To make the changes permanent, I added the entry to /etc/sysctl.conf:

net.ipv4.ip_local_port_range = 56000 65000

Lessons learned

1) Diagnosing application interactions and dependencies is fun ;-)
2) watch is a really useful little command on Linux
3) When choosing OBIEE port ranges in multi-environment Exalytics installations, bear in mind that you want to partition off a port range for TimesTen, so keep the port ranges allocated to OBIEE ‘lean’.
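
A quick way to apply lesson 3 is to check a planned list of OBIEE ports against the kernel's local port range before building the domain. A minimal sketch; ports_in_range is a hypothetical helper, and the port numbers in the usage comment are placeholders rather than a recommended allocation:

```shell
# ports_in_range LOW HIGH PORT...
# Prints each PORT that lies inside LOW..HIGH (inclusive), one per line.
ports_in_range() {
    local low=$1 high=$2 port
    shift 2
    for port in "$@"; do
        if [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]; then
            echo "$port"
        fi
    done
}

# Example (Linux): compare placeholder ports with the live kernel range.
#   read -r low high < /proc/sys/net/ipv4/ip_local_port_range
#   ports_in_range "$low" "$high" 9703 9704 56010
```

Any port it prints is one that TimesTen, or any other process asking the kernel for a port, could grab first.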

