
Monitoring ColdFusion web server connectors, more on Tomcat 'Status Workers'

If you're running CF 10 or above, there was a very interesting post on the Adobe CF blog, from July 19 2015, entitled, Configuring Status Worker in Connectors. (If that URL fails, here's a link via the archive.org site.) The Adobe blog post title may not have caught your attention, but it's about setting up a lightweight and built-in Tomcat monitoring feature for observing the status of the Tomcat web server connector.

You may want to consider enabling it, but I would add some caveats and observations that I share below. Note that it's really quite easy to enable, and DOES NOT require a restart of CF (only of your web server, or technically in IIS, the application pool/s) to take effect.

If you've not yet read their blog entry, go check it out and then come back here for several observations I have to share, some of which I think you'll agree could be very important. (BTW, if you don't follow that Adobe CF blog regularly, you really should. Often great content, and very little "noise".)

What is the web server connector and why should I care to monitor it?

It may help some readers to explain that the web server connector is the means by which requests get from your web server (IIS, Apache, etc.) to CF. In CF9 and earlier, it was a JRun connector. In CF10 and above, it's a Tomcat connector.

While for most folks the connector is something they may never know or worry about, for others it's been a bane of their existence especially since moving to CF10 or 11 (for reasons outlined in comments in other Adobe blog posts, like ColdFusion 11 IIS Connector Tuning.)

(Image: status worker monitor page)

And so this "status worker" lightweight monitoring feature could be helpful for everyone running CF10 and 11 to consider. I show here a portion of its report, as enabled on my own site just now.

(Let me also clarify, at the outset, that this Status Worker is not at all a substitute for a full-fledged request monitoring tool, like the CF Enterprise Server Monitor, FusionReactor, or SeeFusion. It does NOT list all running requests, like those do. As you can see, the report on the right does not list running requests, though it does list at least a count of running requests. I show one in the image, indicated by the "busy" count showing "1", discussed further below.)

This "Status Worker" is a Tomcat feature, not an Adobe creation

As another point of clarification, and as Chinoy does indicate, this "status worker" is not an Adobe creation. It's a built-in Tomcat feature. It's just not one that Adobe had mentioned before. You could have found it yourself in the Tomcat docs for it, which he helpfully points to.

(Indeed, the Tomcat web server connector is used by default for Lucee/Railo or indeed any Tomcat implementation which uses the AJP connector, and this status worker feature would apply to you. If you instead use the BonCode connector with Lucee/Railo or CF, then you are not using the Tomcat-provided connector and this status worker concept does not apply to you, I would think.)

Some challenges to beware

Still, even upon reading either just the Adobe blog, or indeed those Tomcat docs, you may find some challenges when trying to implement and understand the status worker, despite Chinoy's helpful explanation. I wanted to highlight my experience resolving some of those challenges here.

1) Don't leave it editable and/or unsecured

If this will be a server on the internet (or indeed, accessible to an intranet where someone could want to cause trouble), then please heed his warning to enable the read_only=True option when enabling/configuring it.

If you do not, then by default ANYONE who reads the Adobe blog post could then try their proposed /cf/status URL against your server and (assuming you have configured it per those steps) they will be able to not only MONITOR your web server connector status info but more important they will be able to change the settings in your web server connector.

Some may note that "these changes don't change the configuration permanently, they only last until the next restart of the connector", and that is true. Still, someone could cause trouble for you by changing some of the connector settings on-the-fly while your server is running.

As the Tomcat docs note, another option to consider is to use web server configuration features (IIS, Apache, etc.) to control who can access this /cf/status URL. For instance, with either Apache or IIS, it would be easy to use their available URL Rewrite feature, to look for a pattern /cf/status*, and "reject" the request if the {REMOTE_ADDR} "does not match" a given IP address (and you can repeat that condition again to name another remote_addr).
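To make the logic of such a rule concrete, here is a minimal sketch in Python of the decision it would make. This is not IIS or Apache configuration, only an illustration of the "pattern matches and address does not match" condition. The allowed addresses and the function name are hypothetical.

```python
import ipaddress

# The URL pattern proposed in the Adobe post; yours may differ.
STATUS_PREFIX = "/cf/status"

# Hypothetical list of addresses permitted to view the status worker.
ALLOWED_ADDRS = {"127.0.0.1", "203.0.113.10"}


def should_reject(path: str, remote_addr: str) -> bool:
    """Return True when a rule of this shape would reject the request."""
    # The rule only applies to URLs matching /cf/status*
    if not path.lower().startswith(STATUS_PREFIX):
        return False
    client = ipaddress.ip_address(remote_addr)
    allowed = {ipaddress.ip_address(addr) for addr in ALLOWED_ADDRS}
    # Reject when REMOTE_ADDR matches none of the allowed addresses.
    return client not in allowed


print(should_reject("/cf/status", "198.51.100.7"))  # outside address
print(should_reject("/cf/status", "127.0.0.1"))     # allowed address
```

Whatever tool you use to express the rule, the point is the same: requests for other URLs pass through untouched, and status requests are refused unless they come from an address you have named.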

(And of course, you could use a different URL than /cf/status, but beware the oft-made warning about "security by obscurity".)

The wiser plan is to ALWAYS set this read_only option on, by default, if enabling this status worker. If you need to make changes, change them in the config files and restart the worker (restart Apache or IIS, or in IIS you could also just recycle the app pool associated with the connector to restart it.)

2) You need to configure the worker for each connector, if you have more than one

Note that this status worker feature is indeed enabled PER CONNECTOR (and I discuss later how you view it PER WEB SITE).

But my point here is that if you have configured more than one connector (if you have multiple numbered folders under [coldfusion10|11]\config\wsconfig), and you want to have a status worker working for each, then you need to specify this configuration (in the files Chinoy showed) within each connector folder. If you get a 404 error trying to access the /cf/status URL, this is probably your problem. (You can get that also, though, if you just forgot to restart the web server/recycle the app pool for the site in question.)
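If you have several connector folders, a small script can help confirm which of them define a status worker and whether read_only is on. The following is only a sketch, in Python, under some assumptions: that each numbered folder holds a workers.properties file, that a status worker is declared with a worker.NAME.type=status line, and that the values treated as "true" below match what your connector accepts. The worker name in the sample is hypothetical.

```python
from pathlib import Path


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def audit_status_workers(props: dict) -> dict:
    """Map each status worker name to True if read_only is enabled."""
    result = {}
    for key, value in props.items():
        parts = key.split(".")
        is_type_line = len(parts) == 3 and parts[0] == "worker" and parts[2] == "type"
        if is_type_line and value.lower() == "status":
            name = parts[1]
            read_only = props.get(f"worker.{name}.read_only", "false")
            # Assumption: these spellings are the ones treated as "on".
            result[name] = read_only.lower() in ("true", "on", "yes", "1")
    return result


def audit_wsconfig(wsconfig_dir: str) -> dict:
    """Audit workers.properties in every connector folder under wsconfig."""
    report = {}
    for props_file in sorted(Path(wsconfig_dir).glob("*/workers.properties")):
        props = parse_properties(props_file.read_text())
        report[props_file.parent.name] = audit_status_workers(props)
    return report


sample = """
worker.list=cfusion,statuswkr
worker.cfusion.type=ajp13
worker.statuswkr.type=status
worker.statuswkr.read_only=True
"""
print(audit_status_workers(parse_properties(sample)))
```

Calling audit_wsconfig with the path to your config\wsconfig folder would return one entry per numbered connector folder. An empty entry for a folder means no status worker was found there, which would explain a 404 on that connector's sites.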

The way you would get to each status worker is to visit the site's URL by whatever way your web server's configuration defines a binding (for a domain/ip address and/or port), and then add the status URL you set up (like /cf/status).

Again, I'll have more to say on this in a moment, but if you had multiple web sites sharing one or multiple connectors, you would use the /cf/status URL to view the monitor for each site's domain/ip/port to see the status of connections from the web server to CF for that site. (Note that it's totally ok for the worker name and indeed the URL pattern to be the same in multiple connectors. Connectors are independent from each other.)

BTW, the "port" that he refers to (for accessing the status worker display) is indeed the port (if any) that you would use for requests made to your web site which uses the connector configured to be monitored. It does NOT refer to the AJP port, like 8012 or 8013 or 8014, as you may see listed in the workers.properties file. Nor does it refer to ColdFusion's internal web server port, like 8500, if you may have that enabled. This web server connector has nothing to do with using CF's internal web server. It's only how your external web server, like IIS, Apache, or nginx, is connected to CF.

3) Each status worker reports on connections for the one site through which you're viewing it, not all connections from all sites

(I have revised this since originally writing the blog post, to really make things clear.)

This is very important to understand: each status worker report will only report on connections to THAT one specific site through which you're viewing it. It does NOT report on requests going through any other site and/or connector.

So yes, this means that for you to know the status of connections against ALL your sites you'd need to run this tool AGAINST ALL YOUR SITES, one at a time.

Being able to see all the connector values for all sites at once could be valuable, because you may be having a problem with one site's use of connections and not another!

Another benefit would be to know how many connections are being used across ALL sites at one time, as would be counted against the total connection_pool_size for that connector, and the maxthreads defined in CF.

Sadly, there is no mechanism in this Tomcat-provided tool to get one interface that reports for ALL web sites. Nor is there any logging of this connection info, at all. The closest we have is the metrics log, about which I will blog more later.

[I will note, though, that if you have your web site bindings set up so that a given web site does respond to multiple domains and/or subdomains, then you only need to view the status worker for one of those domains, as it's per SITE not per domain within a site, technically. And FWIW, the status worker does not reflect any requests made using ColdFusion's built-in web server, such as you may have enabled to process requests for the CF Admin, for instance.]

As for a means to automatically monitor the status worker, I will also note that there is an option in the status worker to cause its output to be rendered as XML. One could create a mechanism to track that information programmatically, though it's beyond the scope of this post to explore that further.
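Just to give a sense of what such a mechanism might look like, here is a minimal sketch in Python that pulls the busy, connected, and state values out of an XML document. Please note the assumptions: the sample XML below is hypothetical and only modeled on the values discussed in this post, and the element name (ajp) and attribute names may not match what your connector version emits. Check them against your own XML output. Fetching the XML from your site is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, NOT captured from a real connector.
SAMPLE_XML = """<jk:status xmlns:jk="http://tomcat.apache.org">
  <jk:ajp_workers count="1">
    <jk:ajp name="cfusion" type="ajp13" state="OK" busy="1" connected="3"/>
  </jk:ajp_workers>
</jk:status>"""


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def summarize_workers(xml_text: str) -> list:
    """Return one dict per worker element found in the XML."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        # Assumption: each AJP worker is an element whose local name is "ajp".
        if local_name(elem.tag) != "ajp":
            continue
        rows.append({
            "name": elem.get("name"),
            "state": elem.get("state"),
            "busy": int(elem.get("busy", "0")),
            "connected": int(elem.get("connected", "0")),
        })
    return rows


for row in summarize_workers(SAMPLE_XML):
    print(row)
```

A script like this could be run on a schedule against each site's status URL and its results written to a log, which would at least partly make up for the lack of logging in the tool itself.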

New/Updated 4) Beware that the status worker information only reflects the status of requests made SINCE the connector was last restarted (IIS App pool was restarted, for instance)

This is a new point I am adding, after originally writing the post. I have come to observe that the information displayed in the status worker report is REFRESHED (wiped out, starts over) if the connector is restarted. In the case of IIS, that's if the application pool that underlies the site in question gets restarted/recycled. The numbers are of course also reset if the web server (IIS, Apache, nginx) is itself restarted.

In the case of IIS, note also that application pools can be recycled not only manually (such as by right-clicking on one in the application pools list), but they can also recycle on their own. For instance, by default app pools recycle after 20 minutes of inactivity, and every 1740 minutes (29 hours, or 1 day and 5 hours)!

My point is that if you are viewing the output of the status worker, it is NOT necessarily cumulative over a very long period of time. It could reflect only minutes worth of information, even though CF has not itself restarted.

5) Beware that some status worker operations may hang up the connector momentarily

Note that some status worker operations, like the edit operation (to display the connector options which can be edited, as discussed above), may LOCK UP the connector briefly. This affects ALL requests using the connector, including from other users.

So if you see your status worker web request hanging up (which could be a few seconds), beware it may not be only YOUR browser (making that request) which is hung up: it may be the entire connector that's hung up, and other requests made by other users on the same connector will be hung up, again perhaps only momentarily.

I've only observed this with the initial display of that "edit" operation in the web interface, and only for at most a few seconds. Still, do beware. It's easy to think it was "only your request" that was hung up. I confirmed this while observing multiple concurrent requests.

6) The Status Worker info is helpful, but may not help solve all connector problems

The information reported by this Status Worker mechanism may be helpful for some problems, so do check it out. But it's not clear to me if it will help us all to understand/resolve problems related to proper configuration of the connector, as have been discussed in the Adobe blog entries like the one I mentioned before, at http://blogs.coldfusion.com/post.cfm/coldfusion-11-iis-connector-tuning.

My observation of things is that the "count" info (such as the counts of how many requests are "busy", or "connected") are the same information that can be seen in the metrics.log (new in CF10 and 11), assuming it's properly configured. (I've done a separate post on that, where I report how the port in the CF Admin Debugging Output page setting for this should be the connector/AJP port, at least if most requests to CF do connect through an external web server. Sadly, the CF Admin defaults to using CF's built-in web server port, 8500.)

Anyway, I can report that the "busy" value (in the status worker report, and the metrics.log) does reflect the number of currently running requests (if any), if you do have long-running CFML requests (running in that site and connector). That's interesting, but remember again that this only reflects requests in THAT site and connector. If you have requests running against OTHER sites and/or connectors, this total will NOT equal the TOTAL number of running requests in CF (which IS reported in the metrics log, CFSTAT, and tools like the CF Enterprise Server Monitor, FusionReactor, and SeeFusion).

The "connected" ("Con") value (in both) is more interesting. It seems to reflect the count of connections, which do indeed live on beyond the life of a request. (They can be terminated after being inactive for a while by setting the connection_pool_timeout feature discussed in the Adobe IIS connector tuning blog entry mentioned above, and they are also terminated if the web server or app pool in IIS is recycled.) That number may be helpful for knowing when a site is running short of available connections (also discussed in that IIS connector tuning blog entry), but again it's a challenge that this worker only reports for the one site against which you're running it.

The "state" value may be useful, if it's ever showing other than "OK". The legend indicates that we may want to watch out for a state value of "OK (busy)" (all connections busy), or perhaps also "ERR" (error, with possible sub-"states"), or maybe even "OK (idle)" (no requests handled). More on these in the Tomcat status worker reference docs.

Just remember, again, as discussed above, the info in this status worker report is only for requests against THAT site and reflects information since the web server (and/or IIS app pool underlying that site) was last restarted.

7) The Status Worker can help manage Tomcat/CF clusters

One more aside about the status worker: while my focus here has been on using it for monitoring and troubleshooting, it's worth noting that if your connector has been configured for load balancing, then the status worker both provides additional information about the cluster and adds features (in the web UI) to manage the cluster, including taking instances (workers) out of the cluster. Again, this is beyond the scope of this post, but see the discussion of options like "edit" and "recover" in the aforementioned Tomcat status worker reference docs.

8) Monitoring the connector with JMX

Before concluding, let me note that another tool which could help with monitoring the connectors is the underlying Tomcat JMX beans. I hope to someday do a post on them, and how to get access to them, and hopefully also how to log them to watch changes in monitored values over time.

9) Conclusion

So the status worker is an interesting Tomcat feature, which CF users may want to leverage for monitoring the web server connector, which handles communications between the web server and CF. It can help track how the connector is being used, and it may help understand configuration issues regarding the connector.

I'd love to hear if others have success using the status worker, and for sure if I get to work with someone with the problem and we discover something using the status worker, I'll update here or create a new post.

Hope all that's helpful.

Comments
Thanks for fleshing this out, Charlie. Most helpful!
# Posted By Paul | 8/3/15 12:04 PM
And thank you, Paul, for that. (Sorry for the delay in acknowledging that.)
# Posted By charlie arehart | 8/21/15 9:27 AM
Charlie, this might seem like a stupid question, but I want to make sure I get this right because we are having some issues with our services. On the timeout settings, in their blog, Adobe says: Connection pool timeout: - This setting determines the timeout value (in seconds) for idle connections in connection pool. This value must be in sync with the connectionTimeout attribute of your AJP connector in Tomcat's server.xml.

I guess our setting is not correct because the timeout setting in wsconfig's configuration files is set to 60 seconds and I see connectionTimeout="20000" in the server.xml file. Should I set connectionTimeout="60000" in the server.xml file to match wsconfig? Or should I change the value in wsconfig configuration files to be 20 seconds instead of 60? Or does it even matter? Thanks
# Posted By Pawel | 11/25/15 9:37 AM
Hi, Pawel. I will say that at least according to Adobe, you would want to set the times equal, and yep, the one file is in seconds and the other is in milliseconds. So you ought to change one or the other. (Anything is better than the default, of 0, which means there is no timeout, and idle connection threads can live indefinitely, which can be bad under certain circumstances.)

Now, I say above that "according to Adobe" you should make the change, but you ask if it "really matters", and in fact I would note that many people have by mistake made the change only to the property file of the connector and not remembered to change the server.xml. Or they may make the change in both, but then "reconfigure" their connector and forget to put the timeout change back in the property file.

In both cases, I have not found it to be the cause of any significant problem, myself. Could it cause a problem, if they are not in sync? I suppose. I'm just saying that I don't think the implications are well-understood. It would be something useful to better understand, for sure, for everyone's sake.

But bottom line for you: just make them be in sync. :-)
# Posted By Charlie Arehart | 12/2/15 11:21 AM
Thanks Charlie, this is very helpful. I ended up going with the below configuration after I re-read Adobe's article and updated our documentation to check the server.xml setting after reconfiguring the connectors (I didn't even think about that before your comment).

server.xml
<Connector protocol="HTTP/1.1" port="#" redirectPort="#" maxThreads="500" connectionTimeout="60000"/>

And the connectors:
worker.cfusion.connection_pool_timeout=60
worker.cfusion.max_reuse_connections=250
worker.cfusion.connection_pool_size=500
# Posted By Pawel | 12/2/15 11:27 AM
Yep, that has the timeouts in sync. Just be sure to check that workers.properties file in all the connector folders (the numbered ones under config\wsconfig), as the timeout in all of them should be the same, to be in sync with that server.xml setting.

That said, the other settings, like max reuse and pool size may well vary between the connectors (or not), and I would note that no one can say from the outside whether the particular values one chooses would be "good ones" or not, as there are too many variables.

Again, someday I want to blog about that with more observations. :-) For now, you're at least set with respect to the timeouts being in sync between this connector and CF.
# Posted By Charlie Arehart | 12/2/15 11:37 AM
Great. Thanks again
# Posted By Pawel | 12/2/15 11:44 AM
Anybody have a copy of the Adobe Blog entry referred to here? Looks like it was removed at some point.
# Posted By Trevor | 3/12/18 1:52 PM
Yes, Trevor. You can find it in the web archive here:

http://web.archive.o...://blogs.coldfusion.com/post.cfm/configuring-status-worker-in-connectors

And since that will likely break in this blog software's reformatting of the URL, here's a bitly link:

http://bit.ly/cftomc...

For more on that archive, and how to leverage it for a question like this, see my post here:

http://www.carehart....

As for this blog post being missing from the Adobe blog, that is indeed very odd. I don't know of it being intentional. I will ask some folks to see if they may reply about this (or put it back. They may do neither, of course).
# Posted By Charlie Arehart | 3/12/18 1:55 PM
FWIW, the Adobe blog post was restored finally today.
# Posted By Charlie Arehart | 5/16/18 10:40 PM
Copyright ©2018 Charlie Arehart