CF911: Have you updated your #ColdFusion JVM to _24 yet? Important security fix for CF 8/9

This isn't new info, but you may have missed it. If you're running CF 8 or 9, did you know you can and should update the JVM that came with it? And that you have Adobe's blessing to do this update? This is because of a serious bug in the JVM that is not fixed until 1.6.0_24.

Both CF 9.0 and 9.0.1 run on older JVMs (and therefore need this update). And are you on CF 8? You're not left out: Adobe has even confirmed this update can be applied to CF 8 and 8.0.1, too!

Update: Since posting this note some have kindly pointed out that Java has been updated beyond update 24, and indeed to Java 7. I did realize that, but didn't mention it simply because u24 is all that Adobe has (for now) certified CF to support. Still, for the sake of completeness, I want to clarify this now and I'll address it more in the final section below.

Old news, but not everyone knows

Someone mentioned today on a list that they'd seen news from Oracle of this important bug (and fix) for the JVM, which lets someone crash the JVM using a particular URL string. He noted that Oracle had released the fix some months ago, but lamented that "This issue doesn't seem to have gotten too much interest from Adobe."

I explained that in fact Adobe had addressed it, also some months ago.

Adobe's response

Adobe offered a technote on the issue in March 2011.

In fact, they not only confirmed support for (and recommended updating to) the fixed JVM version, 1.6.0_24, but they even (for the first time in years) approved updating to this JVM version for CF 8 and 8.0.1. Since those ran on very old JVM releases, which had problems not fixed until JVM update 10, this was really good news.

But as his note conveys, word had not gotten out as much as it could have. Besides his thread on that list, I hope this blog entry will help reach more people.

More about the bug

If you want to know more about the bug from a CF perspective, check out this blog entry (from several months ago, by David Stockton of cfconsultant.com). He explains the problem/risk, shows how to cause the problem (to confirm when it's fixed), and offers some useful extra info on using FusionReactor to help diagnose it.

How to update the JVM for CF

So if you're now persuaded, you may wonder how you go about updating the JVM. You may fear it's like doing open-heart surgery. It's more like getting a mole removed. :-) It could go wrong, but will barely be noticeable if done right.

There are many blog entries that walk through the few simple steps. Find out more from Ryan Stille, Mark Kruger, and Adobe, to name just a few.

While it's not too hard to do, there are just a couple of potential gotchas: be sure to get the JDK not the JRE, and pay attention to the special path format that's needed when pointing to the JVM on Windows.
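
Once the update is done and CF has been restarted, a quick way to confirm that CF really picked up the new JVM is to ask Java itself from a CFML page. This is just a minimal sketch (the variable name javaSystem is only for illustration); after a successful move to u24, the version reported should be 1.6.0_24.

```cfml
<!--- Sketch: report the JVM that ColdFusion is actually running on --->
<cfset javaSystem = CreateObject("java", "java.lang.System")>
<cfoutput>
   JVM version: #javaSystem.getProperty("java.version")#<br>
   JVM home: #javaSystem.getProperty("java.home")#
</cfoutput>
```

If the version shown is still the old one, the path pointing CF to the JVM is the first thing to recheck.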

So if you're on CF and have not yet updated the JVM, seriously consider it.

What about updates beyond u24?

This is an update since I posted the entry above. As you'll see in the comments below, some folks were kind enough to write in and point out that there have been JVM updates since u24.

I did realize that, and do appreciate their wanting to share the info. I'd not mentioned updates beyond 24 simply because I would not propose (myself) that anyone now update beyond the release for which Adobe has certified CF. (At least not until compelling reasons arise and substantial community testing suggests it may be safe, as happened with CF8 and the u10 update. Even then, you are risking being out of bounds for support by Adobe, so should make such a change with caution.)

The whole point of this entry was that u24 HAD in fact received Adobe's blessing, for CF 8 and 9, which was pretty compelling and seemed to have been missed by some when it happened earlier in the year.

Anyway, one more tip: in case anyone may find it hard to locate the older updates (like u24) on the Oracle site, here is a link (which works, at least, for now):

http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-javase6-419409.html

Again, be sure to get the JDK (development kit) not the JRE (runtime environment) among the options listed there.

Some code to throttle rapid requests to your CF server from one IP address

Some time ago I implemented some code on my own site to throttle when any single IP address (bot, spider, hacker, user) made too many requests at once. I've mentioned it occasionally and people have often asked me to share it, which I've happily done by email. Today with another request I decided to post it and of course seek any feedback.

It's just a rough cut. I haven't thought it through thoroughly (wow, how's that for an alliteration!). Still, while I know there are a couple of concerns that will come to mind for some readers, and I try to address those at the end, it does work for me and has helped improve my server's stability and reliability.

Background: do you need to care? Perhaps more than you realize

As background, in my consulting to help people troubleshoot CF server problems, one of the most common surprises I help people discover is that their servers are often being bombarded by spiders, bots, hackers, people grabbing their content, rss readers, or even just their own internal/external ping tools (monitoring whether the server is up.)

It may be that there are many more of these than they expect, that they come more often than they expect, or that they hit the server extremely fast (even many times a second). This throttle tool helps deal with the last of these.

Why you can't "just use robots.txt and call it a day"

Yes, I do know that there is a robots.txt standard (or "robots exclusion protocol") which, if implemented on your server, robots should follow so as not to abuse your site. And it does offer a crawl-delay option.

The first problem is that some of the things I allude to above aren't bots in the classic sense (such as RSS readers, ping tools). They don't "crawl" your site, so they don't consider that they need to be told how/where to look. They're just coming looking for a given page.

The second problem is that some bots simply ignore the robots.txt, or don't honor all of it. For instance, while Google honors the file in terms of what it should look at, my understanding is that it instead requires you to use its Webmaster Tools for your site to control its crawl rate.

Then, too, if you have multiple sites on your server, the spider or bot may not consider that in deciding to send a wave of requests to your server. It may say "I'll only send requests to domain x at a rate of 1 per second", but it may not realize that it's sending requests to domains x, y, z (and a, b, and c), all of which are on one server/cluster, which could lead a single server to in fact be hit far more than once a second (in that scenario). It may seem that's an edge case, but honestly it's not that unusual from what I've observed.

Finally, another reason all this becomes a concern is that of course there can be many spiders, bots, and other automated requests all hitting your server at once sometimes. My tool can't help with that, but it can at least help with the other points above.

(As with so much in IT and this very space, things do change, so what's true today may change, or one may have old knowledge, so as always I welcome feedback.)

The code

So I hope I've made the case for why you should consider some throttling, such that too many requests from one IP address are rejected. I've done it in a two-fold approach, sending both plain text and an http header that is appropriate for this sort of "slow down" kind of rejection. You can certainly change it to your taste.

I've just implemented it as a UDF (user-defined function). Yes, I could have also written it all in CFScript (which would run in any release, as there's nothing in that code that couldn't be written in script--well, except the CFLOG, which could be removed). But since CF6 added the ability to define UDFs with tags, and to keep things simplest for the most people, I've just done it as tags. Feel free to modify it to all script if you'd like. It's just a starting point.

I simply drop the UDF into my application.cfm (or application.cfc, as appropriate). Yes, one could include it, or implement it as a CFC method if they wished.

<cffunction name="limiter">
   <!---
      Written by Charlie Arehart, charlie@carehart.org, in 2009
      - Throttles requests made more than "count" times within "duration" seconds.
      - sends 503 status code for bots to consider as well as text for humans to read
      - also logs to a new "limiter.log" that is created automatically in cf logs directory, tracking when limits are hit, to help fine tune
      - note that since it relies on the application scope, you need to place the call to it AFTER a cfapplication tag in application.cfm
   --->

   <cfargument name="duration" type="numeric" default="3">
   <cfargument name="count" type="numeric" default="3">

   <cfif not IsDefined("application.rate_limiter")>
      <cfset application.rate_limiter = StructNew()>
      <cfset application.rate_limiter[CGI.REMOTE_ADDR] = StructNew()>
      <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = 1>
      <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
   <cfelse>
      <cfif StructKeyExists(application.rate_limiter, CGI.REMOTE_ADDR) and DateDiff("s",application.rate_limiter[CGI.REMOTE_ADDR].last_attempt,Now()) LT arguments.duration>
         <cfif application.rate_limiter[CGI.REMOTE_ADDR].attempts GT arguments.count>
            <cfoutput><p>You are making too many requests too fast, please slow down and wait #arguments.duration# seconds</p></cfoutput>
            <cfheader statuscode="503" statustext="Service Unavailable">
            <cfheader name="Retry-After" value="#arguments.duration#">
            <cflog file="limiter" text="#cgi.remote_addr# #application.rate_limiter[CGI.REMOTE_ADDR].attempts# #cgi.request_method# #cgi.SCRIPT_NAME# #cgi.QUERY_STRING# #cgi.http_user_agent# #application.rate_limiter[CGI.REMOTE_ADDR].last_attempt#">
            <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = application.rate_limiter[CGI.REMOTE_ADDR].attempts + 1>
            <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
            <cfabort>
         <cfelse>
            <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = application.rate_limiter[CGI.REMOTE_ADDR].attempts + 1>
            <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
         </cfif>
      <cfelse>
         <cfset application.rate_limiter[CGI.REMOTE_ADDR] = StructNew()>
         <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = 1>
         <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
      </cfif>
   </cfif>
</cffunction>

Then I call the UDF, using simply cfset limiter(), as shown below. That's it. No arguments need be passed to it, unless you want to override the defaults of limiting things to 3 requests from one IP address within 3 seconds.

<!--- the following must be done after cfapplication --->
<cfset limiter()>

Note that since the UDF relies on the application scope, you need to place the call to it AFTER a cfapplication tag if using application.cfm.
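
As a rough illustration of overriding those defaults, the call can pass either or both arguments by name. The values here are purely hypothetical, not recommendations:

```cfml
<!--- Hypothetical values: reject an IP once its counter exceeds 10
      while its requests keep arriving less than 5 seconds apart --->
<cfset limiter(duration=5, count=10)>
```

Keep in mind how the UDF above measures things: it compares the time since that IP's previous request against duration, and the counter starts over at 1 whenever a request arrives duration or more seconds after the last one.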

Caveats and more

There are definitely a few points to consider, and some concerns/observations that readers may have.

  • First, BlueDragon fans will want to point out that they don't need to code a solution at all (or use this), because it has had a CFTHROTTLE tag for several years. Indeed it has. I do wish Adobe would implement it in CF (I'm not aware of it existing in Railo). Until then, perhaps this will help others as it has me.
  • More important, some will be quick to point out that a potential flaw in the approach of throttling by IP address is that you may have some visitors who are behind a proxy, where they appear to your server to all be coming from one IP address. Fair enough. This is a dilemma that requires more handling. For instance, the BD CFThrottle tag implements this with a TOKEN attribute allowing you to key on yet another field in the request headers. I didn't choose to bother with that, as in my case (on my site), I just am not that worried about the problem. You may need to, so beware. Again, the log will help you determine how much it's doing any work at all.
  • And some may recommend (and others may want to consider) instead doing this throttling at the servlet filter level, rather than in CFML (something I've written about before). Yep, since CF runs atop a servlet engine (JRun by default), you could indeed do that, which could then apply to all applications on your entire CF server (rather than being implemented per application like above). And there are indeed throttling servlet filters, such as this one. Again, I offer this for those who aren't interested in that.
  • And of course, an inevitable question/concern some may have is, "but if you slow down bots, might that not affect what they think about your site? Might they stop crawling entirely?" I suppose that's a consideration that each will have to make for themselves. I implemented this several months ago and haven't noticed any change either in my page ranks, my own search results, etc. That's all just anecdotal, of course. And again, things can change. I'll say that of course you use this at your own risk. I just offer it for those who may want to consider it, and want to save a little time trying to code up a solution. Again, I welcome feedback if it could be improved.
  • Now, one other gotcha to consider, if you implement this and try to test it: some browsers have a built-in throttling mechanism of their own and they won't send more than x requests to a given domain from the browser at a time. I've spoken on this before, and you can read more from yslow creator Steve Souders. So while you may think you can just hit refresh 4 times to force this, it may not quite work that way. What I have found is that if you wait for each request to finish and then do the refresh (and do that 4 times), you'll get the expected message. Again, use the logs for real verification of whether the throttling is really working for real users, and to what extent.
  • There is of course another nasty effect of spiders, bots, and other automated requests, and that's the risk of an explosion of sessions which could eat away at your Java heap space. People often accuse CF of a memory leak, when it's really just this issue. I've written on it before (see the related entries at the bottom here, above the comments). This suggestion about throttling requests may help a little with that, but it really is a bigger problem with other solutions, which I allude to in the other entries.
  • Finally, yes, I realize I could and should post this to the wonderful CFlib repository, and I surely will. I wouldn't mind getting some feedback if anyone sees any issues with it. I'm sure there's some improvement that could be made. I just wanted to get it out, as is, given that it works for me and may help others.
Besides feedback/corrections/suggestions, please do also let me know here if it's helpful for you.
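
On the proxy concern raised in the list above, one possible way to key on something other than the raw IP address is a small helper that prefers a forwarding header when one is present. This is only a sketch under stated assumptions: the limiterKey name and the choice of the X-Forwarded-For header are hypothetical, and are not part of the UDF above or of BD's CFTHROTTLE. Since any client can set that header itself, it is only trustworthy when a proxy you control adds it.

```cfml
<cffunction name="limiterKey" returntype="string" output="false">
   <!--- Hypothetical helper: picks the value used to identify a requester --->
   <cfset var headers = GetHttpRequestData().headers>
   <cfif StructKeyExists(headers, "X-Forwarded-For") and Len(Trim(headers["X-Forwarded-For"]))>
      <!--- the header is a comma-separated list; use its first entry --->
      <cfreturn Trim(ListFirst(headers["X-Forwarded-For"]))>
   </cfif>
   <!--- no forwarding header present: fall back to the connecting address --->
   <cfreturn CGI.REMOTE_ADDR>
</cffunction>
```

To try it, the references to CGI.REMOTE_ADDR inside limiter() would be swapped for the value this helper returns, and the limiter log would then show whether it changes how often the limit is hit.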

Revisiting CF/Java integration

On a mailing list, someone asked about running/integrating Servlets, JSPs, Struts, and EJBs in CF. This is one of those topics that was discussed a lot when CFMX came out, but those who didn't switch at the time may have missed out.

I thought I'd share here my answer to his question (pointing out several resources for him to learn more), in the hope that it may help others also who may only now be considering such integration.

Since he was already familiar with running JSPs on CF, but some readers here may not be, I'll start with just a quick point about that, then I'll offer what I replied to him.

CF and Java Integration

It may be important to clarify that technically, it was CF 4.51 that first afforded the option to integrate with Java (including EJBs). Though CF then wasn't built upon Java, you could point to a JVM in the CF Admin and various CF tags and functions afforded some Java integration.

CFMX 6, however, was not only built upon Java, but the Enterprise (and Developer) edition specifically added the ability to run JSPs and servlets directly within CF. More than that, there's some significant integration possible.

In the case of JSPs, you could just drop them into the same code directory with your CFML templates. Servlets take a little more work, as explained in my reply to the gent's email, below. He had been reading a JSP/servlet book and wanted to know how to run the latter, especially, on CF, as well as how to integrate with the Struts framework:

I hope I can help and I think you'll find I have good news.

You mention looking at a book on JSPs and servlets, and you ask how to implement them (and JSPs) in CF. Of course, that book won't help with that--but neither really will the CFML Reference (or a site like CFQuickdocs), if you have looked there. You need to look at the ColdFusion Developers Guide in the CF docs (http://livedocs.adobe.com/coldfusion/8/htmldocs/Part_4_CF_DevGuide_1.html), or any CF books out there. The CF manual has a chapter specifically on this topic: Integrating J2EE and Java Elements in CFML Applications.

For instance, that chapter clarifies that to run a servlet called HelloWorldServlet, you put the servlet .java or .class file in the [CFserver]/WEB-INF/classes directory and refer to the servlet with the URL /servlet/HelloWorldServlet. It also discusses sharing data between CFML and such JSPs/servlets. You can even use JSP custom tag libraries directly within CFML, and lots more. And yes, the docs show (briefly) how to enable EJBs and call them from CFML.
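
As a rough sketch of what the CFML side of that integration can look like, a template can set data in the Request scope and then hand off to a JSP through the page context. Note that hello.jsp and the greeting variable are hypothetical names used only for illustration, and this assumes an edition of CF that can run JSPs; the docs chapter above covers how the JSP side reads the shared data.

```cfml
<!--- Sketch only: hello.jsp is a hypothetical JSP sitting beside this CFML template --->
<!--- Request scope data set here is intended to be shared with the JSP --->
<cfset request.greeting = "Hello from CFML">
<!--- Run the JSP and include its output in this page's response --->
<cfset GetPageContext().include("hello.jsp")>
```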

That said, the coverage in the docs may leave one wanting more, so you may want to consider other resources that discuss it more. There was at least one book focused on that, Reality Macromedia ColdFusion MX: J2EE Integration. There were also lots of talks and articles back in the 2002 timeframe, when this stuff really took off with CFMX (though Java integration was added back in CF 4.51, which added a means in the CF Admin to point to a JVM that CF would work with.)

For instance, I did lots of presentations on CF/Java integration (as did others, of course). If you visit http://carehart.org/presentations/, you can search for java, jsp, or servlet filters.

Doing Struts is not discussed in the CF docs, but there was at least one DevCenter article that discussed it specifically: Streamlining Application Development Using Struts in ColdFusion MX.

It's interesting to see these recent questions about things that came out with CF 6--many shops either didn't move from 4/5 right away, or did but didn't take advantage of the new features. Folks in that position will then not have necessarily followed all the resources (books, technotes, blog entries, user group talks) that came out back then.

This is one of the reasons I keep saying that any topic on the CF Meetup is welcomed. Not everyone needs only to learn new stuff; many need to learn what may seem "old" stuff. It's also the reason why I keep pointing to articles and talks I did in the way past. :-)

Though he didn't ask about it, of course also since CF 6 you've been able to deploy CFML as a J2EE (or Java EE) web application (WAR) or enterprise application (EAR). That feature has improved from 6, to 6.1, to 7 (and of course is still possible in 8).

Certainly, if you're a shop that has any Java folks--and especially if there's some strong desire to lean that way, and CF is still seen mistakenly as a proprietary island--it's important to be able to convey to them that your CFML app can be deployed as a pure J2EE web app (WAR/EAR), which is a form they'd expect.

I think all this would be a topic worth my packing together for an upcoming CF meetup session. Until then, again, anyone interested in the topics can see the resources mentioned above that I and others have written.

BlogCFC was created by Raymond Camden. This blog is running version 5.005.
