
CFMyths: "When I download CF to install it from scratch, it has the latest fixes/updaters"

Today I'm starting a new series on CFMyths, some common misconceptions that I find myself often helping correct on lists/forums or with my troubleshooting customers.

First myth up for consideration:

True or false: "If/when I download CF to install it from scratch, the installer has all the latest fixes (updaters, at least)"

Answer: False (generally). For instance, if you download CF9 today (Dec 2010), you still get CF 9.0, released originally in Oct 2009. You don't get the latest updater (9.0.1 as of this writing, released July 2010), though its existence is at least mentioned on the page, nor of course does it then include any hotfixes or cumulative hotfixes.

Why not, you may wonder? I'll explain more in a moment, along with more about hotfixes and updaters as concepts (and where to find them specifically, for each CF release).

About this CFMyths Series

First, I want to introduce this new series, CFMyths. Today's entry, though prompted by a question on a mailing list, is in fact a frequent source of confusion for folks. Indeed, it's one that I've covered (along with many like it) in a CFMythbusters talk that I've given at various events. Of course, relatively few see such talks, and even though I offer the details in the slides, available on my site, many never dig into those either, I think, so I've long meant to create some blog entries from them.

Today's question prompted me to go ahead and start that series. (And I know I still owe readers the continuation of a CF911 series on memory problems that I started last month. Part 2 is now written. I just need to refine it. This topic got my attention, and once I started writing, well, here you go. :-)

"So why is the currently available installer not necessarily 'the very latest version'"?

It's about logistics, really. CF runs on many platforms (OS's, web servers, databases, and different releases of each), so it's expensive for Adobe to create new installers. This is why the base version may be left there for quite a while, with you being expected to apply any subsequent updaters or hotfixes. That will surely surprise many, but it's the way it's been for some time (except 8.0.1, when a completely new installer was offered, in addition to one to update those already on CF 8.)

So it's incumbent upon you, when you download a given release, to pay close attention to what version it is. It will be reflected first in the installer filename, such as coldfusion_9_WWE_win64.exe. It will also be listed with the specific version number as reported in the CF Admin and logs once it's installed (such as 9,0,0,251028 for the base version of 9.0, or 9,0,1,274733 for the base version of 9.0.1). You'll then need to check to see if there may be any updates, hotfixes, or cumulative hotfixes. The same is true for 8.0.1, too (whose base build number is 8,0,1,195765). I'll point you to the updaters and hotfixes in a moment.
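To make that version check concrete, here's a minimal sketch in Python (my choice for illustration, nothing from Adobe). It takes the comma-separated version string as the CF Admin reports it and says whether it matches one of the base builds mentioned above. The function name and lookup table are hypothetical, and the table holds only the three build numbers cited in this entry.

```python
# Base build numbers cited in this entry; any other build is simply reported as-is.
BASE_BUILDS = {
    "9,0,0,251028": "9.0",
    "9,0,1,274733": "9.0.1",
    "8,0,1,195765": "8.0.1",
}

def describe_build(version_string):
    """Describe a CF version string such as '9,0,1,274733'."""
    cleaned = version_string.strip()
    parts = cleaned.split(",")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError("expected four comma-separated numbers: %r" % version_string)
    if cleaned in BASE_BUILDS:
        # A base build means no hotfix or CHF has changed the reported build number.
        return "base build of %s (check for updaters, hotfixes, and CHFs)" % BASE_BUILDS[cleaned]
    release = ".".join(parts[:3])
    return "release %s, build %s (not one of the base builds listed here)" % (release, parts[3])
```

Calling describe_build("9,0,0,251028") would tell you that you're on the base build of 9.0, which is your cue to go looking for the 9.0.1 updater and anything after it.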

"What are all these terms? Hotfixes, updaters, CHFs?"

Before proceeding, it may help some readers if I explain the terminology of the various updates for CF that may exist:

  • Hotfixes are individual fixes to specific problems (typically offered as a single jar file, such as hf801-76563.jar, and for which there is usually an individual technote, such as this one for that hotfix, which is a recent fix--Oct 2010--for CF 8.0.1)
  • Cumulative hot fixes (CHFs): when there have been many hotfixes, they will typically be rolled into a cumulative hotfix which includes several (such as the CHF 1 for 9.0.1, which can only be applied to CF 9.0.1, not CF 9.0). There's no fixed number of hotfixes which will trigger Adobe to create a CHF. They are usually deployed also as a single jar, and you are expected to remove any individual hotfixes that are included. Again, the technote for each hotfix will explain these things. As the name implies, they are also cumulative, such that CHF 2 includes all the hotfixes that CHF 1 included (with a caveat, discussed below). CHFs do not include new features.
  • Updaters: When there have been some number of CHFs, they will often be rolled into an update/updater (like 9.0.1, 8.0.1, 7.0.2, 7.0.1, 6.1). Updaters are always executable installers, and they often do add some minor new functionality to CF.

Adobe has a page that explains more about these terms generically.

Note that when looking at hotfix/CHF filenames (such as hf801-76563.jar), you can tell by the prefix whether it's a hotfix (hf) or CHF and what version it's for (8.0.1, in that example).

Indeed, that brings up another potential gotcha: be careful about inadvertently applying a hotfix/CHF to the wrong release, or leaving an old HF in place that's been obviated by a CHF that includes it. I discuss the first of those gotchas in a blog entry, You may have mistakenly applied an 8.0 CHF on a 8.0.1 CF server, and not realize it!
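Here's a rough sketch in Python of reading a filename that way. It assumes the prefix is "hf" for a hotfix or "chf" for a cumulative hotfix, followed by three digits for the release; the "chf" spelling and the three-digit layout are my assumptions based on the example above, and the function name is hypothetical.

```python
import re

# Assumption: "hf" or "chf", then three digits for the release (801 means 8.0.1).
_FIX_PREFIX = re.compile(r"^(chf|hf)(\d)(\d)(\d)", re.IGNORECASE)

def classify_fix(filename):
    """Return (kind, release) for a hotfix jar name, or None if it doesn't match."""
    match = _FIX_PREFIX.match(filename)
    if match is None:
        return None
    kind = "cumulative hotfix" if match.group(1).lower() == "chf" else "hotfix"
    release = ".".join(match.group(2, 3, 4))
    return kind, release
```

With the example above, classify_fix("hf801-76563.jar") gives ("hotfix", "8.0.1"), which you could then compare against the release you're actually running before dropping the jar in place.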

Finally, the Adobe technote page about each hotfix, CHF, or updater will each explain how to apply that particular update. Again, I'll offer links to the key pages for different release updates in a moment. First, there are a couple of other common misconceptions I want to address.

Phew: I bet some readers thought all this should be pretty straightforward, and may be surprised to learn all these details. It's the kind of thing I often find myself helping my customers correct, and again it also often comes up in questions on lists/forums. You might think we've covered all there is, but sadly, there are still some more gotchas.

"At least when I apply an updater, that includes all hotfixes that preceded it, right?"

Well, not necessarily. For instance, when the CF 7 updater 2 (7.0.2) came out, Macromedia decided not to bundle an updated set of JDBC drivers in the updater. It was up to you to download and install those as a manual hotfix. I blogged about that back at the time, in 2006.

Again, the logic was understandable: there were considerable changes introduced by the drivers (licensed from a third party), and forcing it on existing users could have been troublesome. If you wanted the updated drivers, you needed to download (for free, from Adobe) and install them yourself. The installer notes clarified this, so as always, it pays to read the updater documentation carefully. There are often multiple documents for any new updater, including a FAQ, release notes, and more.

"Well, at least with Cumulative Hotfixes, I need apply only the latest, right?"

Again sadly, no, not necessarily. Let's say you're on CF 8.0.1 and never applied any of its 4 CHFs. You may think you can just apply CHF 4 and not bother with anything related to the previous three. Unfortunately, there are often manual steps within any of the previous CHFs which you WOULD still need to apply.

In fact, this question (and answer) is important enough that I'll create another entry on that so it's not lost here (where this entry's title will lead some to think it applies only to installing CF from scratch).

"OK, so where do I find these updaters I need to check?"

Fortunately, there's one primary page you can keep an eye on, but technically there are various pages you may want to check, depending on your CF version, to get more or less detail.

First, the main "CF updates" page can be found here. It not only includes updaters and CHFs, it even has them for CF 9, 8, 7, 6, and even 5, as well as CF Builder (and CF Studio!)

For those on CF 9.0.1, it includes the latest CHF1 for CF 9.0.1.

Note that it also lists CHFs for CF 8.0.1, 8.0, and 7.0.2.

As for updaters, it includes those for 9.0.1, 8.0.1, 7.0.2, 7.0.1, 6.1, and so on. (Sadly, there's no HTML anchor within the page for sections like the 9.0.1 updater, so I can't offer a link to that part of the page. But if you're still running CF 9.0, go get it! It includes useful new features.)

As for getting a list of all the individual hotfixes (and CHFs) for any specific releases, there are individual pages for those:

While these list both hotfixes and CHFs, the info on the CHFs should be the same as the main "CF updates" page offered previously. But note that these version-specific update pages may well list new hotfixes that came out after a CHF. Do keep an eye on them.

(You may wish there was a better way to keep up with updates and hotfixes, whether a feed you could follow, or even an automated update mechanism. Sadly, there have been feeds, but I'm not aware of one that is reliably updated by Adobe. And there is no automated update mechanism for CF from Adobe. But there is a free tool and service that solves both those problems, providing an automated update mechanism and constantly updated feed of hotfix notifications. For more information, see Merlin Manager.)

Finally, again, look for another entry to come on the sticky problem of how you can't "just apply the latest CHF" and assume you're good to go on any previous updates for that release. Sadly, it's just not often the case.

CF911: Lies, damned lies, and when memory problems may not be at all what they seem, Part 1

Following on my earlier entry, CF911: Lies, Damned Lies, and CF Request Timeouts...What You May Not Realize, another common source of confusion and misunderstanding for people is when they think their server is "running out of memory", when in fact the problem is often not at all what they think. In this entry, I want to apply the same "cranky" tone :-) and extended explanation to this equally controversial/confusing topic.

I hear people raise concerns with memory problems quite often, whether in my CF Server Troubleshooting practice, or just in my participating in many mailing lists. Indeed, addressing this issue more than a few times the past couple of weeks has motivated me to create this, which will be a series of blog entries.

The series parts are expected to be:

  • Step 1: Determine if indeed you are getting "outofmemory" errors (this entry)
  • Step 2: Realize that OutOfMemory does not necessarily mean "out of heap" (entry to come)
  • Step 3: Diagnose why you really are running out of heap (if you are) (entry to come)
  • Step 4: Realize that having high memory usage is not necessarily a problem (entry to come)

Common refrains about memory issues

The common complaints about memory issues (and my quick responses, to give you a sense of where I'll be going in this series) are:

  • "CF has a memory leak" (to which I'd retort, no, generally, it does not. There's nearly always some other explanation)
  • "CF's use of memory is high" (which may not be a problem. If you're looking at memory from the OS perspective, it may not matter as much as heap use within CF)
  • "CF's use of heap memory is high" (to which you'd think, "ah, well that's got to be a problem, right?", but no, not necessarily, as I'll explain in part 4)
  • "CF is crashing all the time" (well, is it really crashing, or just hanging up and not responding? That's not a good thing, but it's very different from it crashing on its own)
  • "CF is running at 100% CPU before it crashes" (this could be related to memory problems, and more a consequence rather than a cause, or it could be entirely unrelated to memory issues)

So what if we really are suffering a problem?

I'm not saying there's never a real problem of really "running out of memory". It's just that often things are not at all what they seem (or what most presume them to be, from my experience helping people), and that's going to be the bulk of what I'll talk about in this series.

But what if your server is really crashing (or simply not responding), and you think/swear/know that it's a memory problem....

What should you do? Increase the heap size? Increase the permspace? Change the GC algorithm?

Sacrifice a chicken?

I'd say, none of them (though if you're in a rural setting, then perhaps cooking and eating the chicken might help settle your blood sugar so you can stay calm). Really, I know that goes against conventional wisdom, which seems always to suggest diving into the JVM settings. I'd say "hold on there, pardner."

Step 1: Determine if indeed you are getting "outofmemory" errors

This is one that surprisingly few people consider when faced with their server crashing or not responding. They go with whatever conveys to them a sense of there being a memory problem, perhaps adding their own experience or what they read, and they start chasing solutions.

I can't tell you how often I hear people lament that they've googled and found all manner of conflicting and confusing recommendations. And it doesn't help at all that they may be running on CF 8 or 9 (with Java 1.6) while reading about a "solution" written in the time of CF 6 or 7, when it ran on Java 1.4. Of course, the writer often won't have thought ahead to clarify that.

Instead, I'm saying, "stop, drop, and roll".

"Stop" the tail-chasing, "drop" into the pertinent logs directory in CF, and "roll" through them looking for an occurrence of "outofmemory".

Look first in the Console/Runtime/JRun logs

Let me be more explicit: the logs you want to look at for these outofmemory errors are NOT (necessarily) the ones you see in the CF Admin Log Files page. Those are in the [cf]\logs directory (or are buried deep within an instance on Multiserver).

Instead, you want to see the "console" or "runtime" logs. Where those are depends on how you are running CF:

  • If you're running CF from the console, then look at the console display of logging info. (And if you started CF within CFBuilder, and did not set it to start as a Windows Service, then look in the console view of CFB for this info.)
  • If on Linux, look at the cfserver.log in the main CF logs directory.
  • If on Windows, running CF as a service, look instead at the -out.logs, found in the [cf]\runtime\logs directory (or [jrun]\logs on Multiserver, in which case there will be a prefix for each instance name in the log file names). You can generally ignore the -event logs there, as they typically just have a subset of what's in the -out logs. (Update for CF10: these logs are in fact now in the same directory as other CF logs.)

Some refer to these as the runtime logs, or the jrun logs, or perhaps the jvm logs. Whatever you may call them, their location is above and the explanation of their value follows.

Bonus topic: you can increase the max log size

(I will note that the way these -out.log files work, by default, is that they fill at 200k increments. Yep, not 200mb, but 200kb!, and you may blow through dozens of them in a few minutes if things are going nuts. That size is configurable, but not through means you'd normally expect. See the blog entry I recently published, CF911: How to control the size of CF's -out.logs.)

Look next in the hotspot/pid/jvm abort logs

Separately, there are some other potentially important logs that may relate info concerning memory problems: what some call the "pid", "hotspot", or jvm abort logs. The filename is in a form like hs_err_pidnnnn.log, with some number in place of the n's.

These logs are found in a very unexpected place (for logs): in the directory where CF stores the jvm.config. So on Standard/Server deployments, that's [cf]\runtime\bin. For Multiserver, that's [jrun]\bin. Look in there for any .log files. There is one log for each time that the jvm crashes due to certain kinds of problems. It could be a crash in the hotspot compiler, in hotspot compiled code, or in native code.

What matters most, for this post, is that again you look in them for any reference to the phrase "outofmemory". (Of course, you'll want to look at them in the event of a crash for any other message or info, but explaining the parts and value of these "pid" log files is beyond the scope of this entry.)
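If you'd rather not eyeball that directory, here's a small Python sketch that lists any of these abort logs, oldest first. The function name is hypothetical, and you'd pass in whichever directory holds your jvm.config, as described above.

```python
import glob
import os

def find_abort_logs(jvm_config_dir):
    """Return hs_err_pid*.log paths in the given directory, oldest first."""
    pattern = os.path.join(jvm_config_dir, "hs_err_pid*.log")
    # Sorting by modification time makes it easy to line the files up with crash times.
    return sorted(glob.glob(pattern), key=os.path.getmtime)
```

An empty list simply means the JVM hasn't written one of these logs there, which is itself useful to know before you assume a JVM crash.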

So, when you have a crash and you suspect it's a memory issue (or if you don't know the cause and want to learn more about what it may have been), you want to look in these two log directories mentioned above. Many never do, and this is part of why they end up chasing their tails, going instead on gut feelings or trying out various alternatives. I say: find the diagnostic info, and act on it.

Searching through the log files, the easy way

But rather than "look" at all these logs in these directories, one at a time, I suggest instead that you search them (I was tempted to say "stop, drop, and mole" above, since you're "ferreting" through the logs, but it just seemed a stretch.)

If you're on *nix, I don't need to give you any more info on how to search the files. Just grep it and rip it. :-)

If you're on Windows, though, you'll perhaps be tempted to use the good ol' built-in Windows Search tool to search the directory. Let me plead, for the sake of all things decent, please use something better. It's just not a completely reliable (nor fast) tool.

I have blogged about a wonderful free alternative, called File Locator Lite. Use that instead. (If you have another tool or editor you favor, that's fine. No need to bring that up here. I recognize those other options in the other blog entry.)

The beauty of FLL is that having installed it (which is fast, itself), if you right-click on the log directory (or any directory) and choose "File Locator Lite" from the menu, you can then just put the string outofmemory in the search box, and it will in a few moments show any files found in the lower left pane. Then, here's the real beauty over other tools, you don't need to double-click the files to open them. You just single-click each file, and in the right pane, it shows any lines that had the string that was searched. Brilliant, and again, a really fast way to find things.
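And if you'd rather script the search than use a tool, the same idea can be sketched in a few lines of Python. This is just an illustration (the function name is hypothetical): it walks a log directory, looks only at .log files, and reports every line containing the string, ignoring case.

```python
import os

def find_outofmemory_lines(log_dir, needle="outofmemory"):
    """Yield (path, line_number, line) for each log line containing the needle.

    The match is case-insensitive, so "OutOfMemory" in any casing is found.
    """
    needle = needle.lower()
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            if not name.lower().endswith(".log"):
                continue
            path = os.path.join(root, name)
            # errors="replace" keeps the scan going if a log has odd bytes in it.
            with open(path, "r", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line.lower():
                        yield path, number, line.rstrip("\n")
```

Looping over find_outofmemory_lines(r"C:\ColdFusion9\runtime\logs") and printing each tuple gives you every hit along with its file and line number, so you can see all the occurrences rather than just one.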

Don't stop at the last outofmemory error before a crash

This feature of the File Locator Lite tool (to see all the lines in the file with that string) is especially useful in this case, because when searching for outofmemory errors, you also want to be able to quickly see the time for *all* the error messages you may find.

And you *do not* want to focus solely on the last error prior to the crash (or the slowdown, that made you want to restart CF).

Once you find one (or more) preceding the time of the crash, you want to look for any occurring prior to it. It may be that the problem started several minutes before the crash (or your restarting CF). Further, it may be that the outofmemory error just prior to the crash is different from the one that started things out.

Step 1 down, 3 more to go

OK, that's step 1 in determining whether memory problems are really at all what they seem. As I mentioned at the outset, the planned parts in the series are:

  • Step 1: Determine if indeed you are getting "outofmemory" errors (this entry)
  • Step 2: Realize that OutOfMemory does not necessarily mean "out of heap" (entry to come)
  • Step 3: Diagnose why you really are running out of heap (if you are) (entry to come)
  • Step 4: Realize that having high memory usage is not necessarily a problem (entry to come)

After I publish them, I'll update the lists here to link to them.

As always, I look forward to your feedback (pro, con, or indifferent).

CF911: How to control the size of CF's -out.logs

As a CF user or administrator running CF 6-9 on Windows, have you ever wondered how to increase the size of the console logs (-out log files in the [cf]\runtime\logs directory, or [jrun]\logs in Multiserver)? This entry will tell you how. It's quite easy to do, but it's not done using usual log file size control settings in CF's Admin or XML files.

The quick answer is to use either of two approaches: either the jrunsvc.exe in CF's runtime\bin (or [jrun]\bin), or do a manual registry tweak, both of which I show below.

BTW, if you don't know what CF's -out*.log files are about (they're important!), they're technically holding the console output for CF, when it's started as a Windows service. This can be vital information that is NOT logged in the normal [cf]\logs directory or Admin Log Files display. (If you start CF from the command line, then the same info is written to the command line instead and not to the log file.)

(If you're on *nix, pretty much the same info appears instead in the [cf]/logs/cfserver.log. I know some people have wondered about controlling the size of that file. What I discuss here applies only to Windows.)

Background on the sizing of the -out*.log files

You may have noticed that the default (since about CF 7) has been for these to grow no larger than 200k (not 200mb, but 200kb!), and up to 200 occurrences of them per instance.

Certain kinds of problems can lead to a lot of information being written to those files, such that in some instances, a given -out log file will contain only a few minutes of data (if that). And if too many of those happen, old log files will rotate off and you may soon not be able to see much more than hours or days ago. It would be helpful, then, to let these files grow larger, both to make it easier to look at any one and see a longer period of time in it, and to allow the sum of rotations to cover a longer period of time.

At the default of 200kb * 200 rotations, that's 40mb at the most that the files can use. I think most servers these days have a good bit more disk space they can afford to use! :-)

Even at 2000kb (2mb, a 10-fold increase), that would still be only a max of 400mb, or less than half a gig. Most drives these days can afford that. :-) How large you want to make them is your call.
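If you want to play with other sizes, the arithmetic above is trivial to sketch in Python. The function name is hypothetical, and the 200 default is just the rotation count mentioned above.

```python
def max_out_log_usage_kb(size_kb, rotations=200):
    """Worst-case disk use, in kb, for one instance's -out logs."""
    return size_kb * rotations

# 200 kb x 200 rotations = 40,000 kb (about 40mb); 2000 kb x 200 = 400,000 kb (about 400mb).
default_usage = max_out_log_usage_kb(200)
larger_usage = max_out_log_usage_kb(2000)
```

Remember that this is per instance, so on a Multiserver deployment you'd multiply again by the number of instances.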

Just beware of the temptation to perhaps lower the rotations and increase the size (like 100mb * 4 rotations). The larger the files get, the less likely you'll be able to easily open them with simple editors like NotePad (which starts to get slower to open, the larger the file is). That said, I can point out a great alternative tool for looking at larger files, though. I just blogged about it, Universal Viewer.

(Actually, prior to CF7, there was a bug where the -out.log file would grow unchecked and could itself grow to a GB, which is one of the first reasons I learned about Universal Viewer. More on that issue and the fix for it in a moment.)

As for the filenames and rotations, if you've not noticed it, they use the instance name and a sequence number, such as coldfusion-out.log for the latest out.log (for a standalone CF instance), or myinstance-out199.log for the 199th rotation of a given instance's out log. The rotation numbering starts at 1 and runs until (the default of) 200 is reached, at which point it starts reusing the numbers, beginning at 1, for the newest logs. As such, at least until that rotation max is reached, the lower the number, the older the log.

BTW, if you wonder, the sequence number is kept in the registry, as another string value defined for the service (see the discussion below), as CurrentOverwriteLog. If you find that the logs are not rotating, that's a problem of permissions in the registry. If you've changed the user under which the CF service runs, you need to give that user permission to update this key. That's fodder for another blog entry some day.

You can't change these logs' size and rotation values in the CF Admin, nor in jrun.xml!
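To illustrate that naming and numbering scheme, here's a small Python sketch. It's only my reading of the behavior described above, not CF's or JRun's actual code, and both function names are hypothetical.

```python
def out_log_name(instance, rotation=None):
    """Build an -out log filename; rotation=None means the current log."""
    if rotation is None:
        return "%s-out.log" % instance
    return "%s-out%d.log" % (instance, rotation)

def next_rotation(current, limit=200):
    """Return the sequence number that follows current, wrapping back to 1 at the limit."""
    return 1 if current >= limit else current + 1
```

So out_log_name("myinstance", 199) gives myinstance-out199.log, and once 200 has been used, next_rotation(200) wraps back around to 1.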

Unfortunately, setting the size and rotation for these files isn't as simple as making a change in the CF Admin. You may notice that there is a setting on the Debugging&Logging>Logging Settings page for "Maximum file size" and "Maximum Number of Archives", but that setting controls the traditional CF logs (those shown in the Admin's Log Files page, or found in [cf]\logs, or deep within the directories for an instance in Multiserver mode.)

Also, some may know that there is an available jrun.xml file that has a jrunx.logger.FileLogEventHandler entry with available rotationSize and rotationFiles attributes, but those only control the -event*.log files (even if you tweak them as suggested by an old Macromedia technote). BTW, if you're interested, that file was in the [cf]\runtime\bin directory in CF 6.1, but since 7 it's been in [cf]\runtime\servers\coldfusion\SERVER-INF (or [jrun]\servers\[instancename]\SERVER-INF on Multiserver).

So how, then do you increase the size of the -out*.log files? Yes, we're finally ready for the "big reveal".

Supported approach: Using the jrunsvc to change the logfilesize and rotation

The good news is that it's fairly easy to update. You just need to know your service name and the size you want to set, and then run the jrunsvc.exe command to make the change. It's found in the [CF]/runtime/bin or [jrun]/bin directory.

For instance, for a CF9 Standard or Enterprise Server mode service, whose name (as shown in the Services Control panel) is "ColdFusion 9 Application Server", you might use:

C:\ColdFusion9\runtime\bin\jrunsvc.exe -logfilesize 1000 "ColdFusion 9 Application Server"

You may need to tweak the start of the command, if your CF installation is somewhere other than c:\ColdFusion9.

The great news is that CF will start leveraging the change immediately, even without a restart.

If you're in an Enterprise multiserver (multiple instance) deployment, the command is instead in C:\JRun4\bin, and of course your service name varies depending on whether it's the cfusion instance, in which case its name is "Macromedia JRun CFusion Server" (yes, it's "Macromedia", even in CF9), or an instance you've added, which may be "Adobe ColdFusion 9 as Instancename". Here's how to do it for the cfusion instance:

C:\JRun4\bin\jrunsvc.exe -logfilesize 1000 "Macromedia JRun CFusion Server"

Pretty simple and sweet. The logs have lots of great info, so increasing them can be really valuable and is something I help people do all the time as I help them solve problems in my CF server troubleshooting consulting services. Hope it helps you.
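If you have several services to update, you might build the command rather than type it each time. Here's a minimal Python sketch that does only that (the function names are hypothetical). It uses just the -logfilesize argument shown above, and it leaves actually running the command to you.

```python
import subprocess

def jrunsvc_logfilesize_args(jrunsvc_path, service_name, size_kb):
    """Return the argument list for setting the -out log size of one service."""
    size = int(size_kb)
    if size <= 0:
        raise ValueError("size_kb must be a positive number of kb")
    return [jrunsvc_path, "-logfilesize", str(size), service_name]

def as_command_line(args):
    """Render the argument list as a single Windows-style command line."""
    # list2cmdline quotes any argument containing spaces, such as the service name.
    return subprocess.list2cmdline(args)
```

Passing the result of jrunsvc_logfilesize_args() to subprocess.run() on the server itself would execute it; as_command_line() is there just so you can see (or log) the exact command first.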

How'd I find this out?

So how did I find the jrunsvc was the solution? And what are the details? Well, it's actually buried at the bottom of an old Adobe Hotfix note for CF7, which you would reasonably never think to look at if you were on CF 8 or 9.

But the information it offers does apply, even on CF 8 or 9. (If you are on CF 7.0.1 or 7.0.2, first apply the hotfix described at the top of that technote. It's the fix for the problem where the -out*.log files would instead grow unchecked in size.)

How will you know that the change takes effect?

Well, of course you can watch the files to see if they grow larger than the default size. But you may prefer to check without waiting so long. :-) That leads to the next point.

Shortcut: Use the registry to confirm or indeed make the change

It turns out that this jrunsvc command simply updates the Windows Registry, with respect to information about the Windows service, so you can also confirm the change by looking in the registry. Better still, you could just skip doing it from the command line and just do the tweak yourself.

You can do it either with a gui like the windows RegEdit tool, or using the command-line Reg command.

And of course, the standard disclaimer about working with the registry applies: here be dragons. :-) If you make a mistake in editing the registry, or mistakenly delete something, it can have negative effects on many things. (But then some mistakes, like giving the wrong name to a new key, might cause no problem at all.)

This particular tweak is really quite simple (adding a single, simple key) and it's safe, if you're comfortable with registry tweaking. I've done it now on several dozen machines and never had a problem.

There's simply a key where all Windows Service settings are stored, and it would have one for each instance of a CF service you may have. For the CF9 Standard or Enterprise Server instance, it would be a key at:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\ColdFusion 9 Application Server

And it would have several values related to the CF service, and the value we're looking for is LogFileSize. Before executing the jrunsvc command above, there is no value for the key (by default), and so again CF defaults to a size of 200.

(BTW, don't be too confused by registry terminology. We might think that we're talking about a "key" named "LogFileSize" with a "value" of "200", but the registry instead uses "key" for the location (that longer string I showed just above the previous paragraph), and it calls this thing we're looking for a "value" and the number it would be set to is technically called "data". You can see this also if you use the "find" feature within RegEdit.)

Checking or Adding with the REG command

Anyway, if you want to confirm that the change you may have made with the jrunsvc (as above) took effect, you should be able to see it with this command-line command (for the service name we've been referring to):

reg query "HKLM\SYSTEM\CurrentControlSet\Services\ColdFusion 9 Application Server" /v LogFileSize

(Yes, I'm using a supported shortcut, with that initial "HKLM", as being short for "HKEY_LOCAL_MACHINE".)

It will report (if you set it to 1000):

LogFileSize REG_SZ 1000

If somehow it's not there, it would report:

ERROR: The system was unable to find the specified registry key or value.

I should note that you would also get that same error if you either pointed at the wrong service, or indeed gave the name of a service that did not exist at all, so be careful.
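If you're capturing that output in a script, a small Python sketch like this could pull out the type and data. It assumes the output has a line shaped like the one shown above (name, type, data, separated by whitespace), and the function name is hypothetical.

```python
def parse_reg_query(output, value_name):
    """Return (type, data) for value_name in `reg query` output, or None if absent."""
    for line in output.splitlines():
        fields = line.split(None, 2)  # name, type, then the data (which may hold spaces)
        if (len(fields) == 3
                and fields[0].lower() == value_name.lower()
                and fields[1].startswith("REG_")):
            return fields[1], fields[2].strip()
    return None
```

A result of None covers both the "not there" case and the error message above, so it's still on you to double-check the service name you queried.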

As for how to add the new value via the REG command, you would use something like this:

REG ADD "HKLM\System\CurrentControlSet\Services\ColdFusion 9 Application Server" /v LogFileSize /t REG_SZ /d 1000 /f

It's about the same effort as using the jrunsvc command above. One advantage of this approach would be that if you had to do this for dozens or hundreds of servers, you could use something like PowerShell to script it, and it has built-in support to work with the registry that might perhaps work more easily than scripting the jrunsvc command above.
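As a rough illustration of that kind of scripting (in Python here, rather than PowerShell), this sketch just generates the REG ADD command line for each service name you give it. The function name is hypothetical, and it writes the value as a string (REG_SZ), matching the type that the reg query output above reports.

```python
def reg_add_commands(service_names, size_kb=1000):
    """Return one REG ADD command line per service, setting LogFileSize."""
    template = ('REG ADD "HKLM\\System\\CurrentControlSet\\Services\\%s" '
                '/v LogFileSize /t REG_SZ /d %d /f')
    return [template % (name, size_kb) for name in service_names]
```

You could write the resulting lines to a .bat file and review them before running anything, which is a bit safer than having a script touch the registry directly.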

Checking or Adding with the RegEdit GUI

Finally, for those who prefer the gui, just launch regedit (google to find out how to do that, though perhaps if you don't know what you're doing, you may want to pass on this, or get some help).

Once regedit is open, drill down to that key location named above. Once it's selected, see if you see a value for LogFileSize there.

And if you want to add it while there (with the cursor on the name of the key, such as HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\ColdFusion 9 Application Server), you can right-click in the values pane on the right and choose "New" (or in the gui's top menu use Edit>New), then choose String Value. Provide a name of LogFileSize and hit enter.

Then double-click on that new value, and enter 1000 for its data (if you want to change the size from the default of 200k to 1000k, or 1mb).

The great news is that, as with the jrunsvc command, CF will start leveraging the change immediately, even without a restart.

What about changing the rotation count?

I'll note as well that, though not mentioned in the technote, if you run the jrunsvc command with a /? argument (or any invalid argument), it shows that there is also an option said to control the log file rotation: -logfileRotationLimit. But when I try to use that, even just to set it to the default of 200, it curiously replies with the error:

Error: log file rotaion limit must be at least 1000

What? (BTW, that misspelling of "rotation" is indeed in the error message.)

It makes no sense that that should be required to be "at least 1000". Perhaps someone from Adobe may want to look into this.

Oh well, you can at least change it via the registry tweak if you want. The new>string value's name should be LogFileRotationLimit.

FWIW, if somehow that error above is resolved, I'll note that you do need to provide the two arguments each in its own separate jrunsvc command, it seems.

Summary

I've confirmed from my own servers and others that making this change (either way) does indeed let the -out*.log files grow larger, which is a great help when solving certain kinds of troubleshooting problems.

In conclusion, I'm surprised to find that if you search google for the phrase:

"jrunsvc" "LogFileSize"

the only hit found (prior to me writing this) was that one Adobe technote (and one with the technote in Japanese). I'm stunned that no one else may have ever mentioned this. I know I've seen other blog entries where people have mentioned that they, too, had noticed that the traditional solutions (and even some other hotfixes) did not solve the problem of how to change the size and rotation of the -out*.log files.

While I do hope that many people will benefit from this entry, I will admit that there's a real sense of discovery and some awe when one gets to "walk" where seemingly few have gone before. :-)

Let me know what you think, whether you read this within days or years of this being published. Cheers.

CF911: Lies, Damned Lies, and CF Request Timeouts...What You May Not Realize

How often have you seen (or seen others complain of getting) a CF page running longer than it's supposed to (perhaps in the CF server monitor, or FusionReactor or SeeFusion)? Maybe you've set the CF Admin "request timeout" to 60 seconds, and you see a request running for 3 minutes, 3 hours, or 3 days! How can that happen?

Or perhaps you've seen this error from ColdFusion, in your logs or on-screen:

The request has exceeded the allowable time limit Tag: cfoutput

Do you know what this means? It's usually not what you think. I've even seen experienced CF developers who get thrown by this challenge. In this entry I'll try to help explain a very common problem and correct some misconceptions. I'll even contend that this info is often useless and indeed misleading (and therefore the feature producing it ought not be relied upon, and should even be turned off). Along the way, I'll share some things that I've not seen documented elsewhere.

Strap on your seatbelts. We're going for a bit of a ride (if it was easy and could be understood in the length of a tweet, then perhaps everyone would already understand it!) As always, I welcome feedback.

What the error usually does NOT mean, though most assume it

People are often mystified: "Why in the heck would a CFOUTPUT take a long time?"

Or perhaps they're a little more savvy as to what's happening, and they assume, "No, it's just that the CF timeout time was reached when it got to the CFOUTPUT". That could be.

Sadly though, in most cases, neither is what has happened. CF is usually NOT reporting that "here is where the app timed out".

What the error usually DOES mean--the surprise

"OK, smarty-pants. What does the error really mean? Are you saying that CF is lying to me?" Well, often, yes, I'm afraid so, but it's not something nefarious.

Rather, what's more typically the explanation is that some previous activity in the page/request, such as a CFQUERY, CFHTTP, invocation of a web service, or the like is what really took a "long time".

If it was this which caused the request to exceed the timeout (either as defined in the CF Admin Settings page, or using CFSETTING RequestTimeout, or a Timeout attribute on a tag), you'd of course expect CF to report it then and there. The problem is that, often, it cannot report it "right then". And it's not its fault.

There are some operations CF/the JVM cannot interrupt

The problem is that CF (and the JVM) cannot interrupt a request while it's processing what's called a "native method". That is quite typically the mode that a request is in while it's waiting for a reply from a CFQUERY, CFHTTP, and so on. These operations talk to something outside of CF (like a database with CFQUERY, or another server with CFHTTP or web service call--which could even be requesting a page from the same CF instance, but technically the underlying Java httpclient process doesn't know that.) It could also happen with file or network operations.

So the request will wait for this long-running operation to finish. It can't stop it, not with the CF Admin request timeout, not with CFSETTING RequestTimeout, not with the kill features in the CF Server Monitor, FusionReactor, and SeeFusion. Nothing. It's like the Anti-Terminator: "it absolutely will not stop" (can't be terminated) until its task is completed.

So what happens when the long-running operation finishes? Is that when the request times out? An example

"Ok, I got it. The long-running operation (CFQUERY, CFHTTP, whatever) will not stop until it's finished. What happens then?"

Well, you see, that's where the confusion comes in. Let's use an example to make things crystal clear.

Say that the CF admin timeout is 60 seconds (not at all uncommon), or perhaps you set the timeout to 60 for a given template using CFSETTING. Anyway, let's say that the request in question gets 2 seconds into processing when it starts running a long-running query (for example). Let's say that query then takes 75 seconds. When the query is done, the request has now run for 77 seconds, which is 17 seconds beyond the timeout time.

We already know that it won't stop the CFQUERY itself (at least until the query is finished). But guess what: it also will NOT report that the timeout has been exceeded on that line (whatever it was, CFQUERY, CFHTTP, etc.) From my experience, CF doesn't check the time against the timeout at the end of operations, but rather at the beginning.

So instead, it will proceed to the next line of code. You'd think, "ok, then, it will stop on whatever is the next line of code and give you the error there, right?" Sadly, not necessarily, and it only adds to the confusion of the timeout message.

CF checks the time at the start of the next operation, but sadly only on SOME tags

So it's bad enough that it won't report the error on the tag that DID run long. Instead, we saw that it will proceed to the next tag/function. But curiously (tragically), CF will often NOT stop on THE next line of code.

Instead, I've observed that it only seems to check the time (against the timeout) at the beginning of CERTAIN tags, such as CFOUTPUT, CFLOOP, CFQUERY, and so on. Yes, I'm saying that I've confirmed that it will skip over various other tags (such as CFSET, or CFSCRIPT code, and more). I've not yet found any documentation as to the details of this.
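To make that behavior concrete, here is a minimal sketch in Java (since CF runs on the JVM) that simulates the model described above: an operation always runs to completion, and elapsed time is compared to the timeout only at the start of certain tags. This is only an illustration of what I've observed, not CF's actual implementation. The class, the method names, and the choice of which steps "check" are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Simulates the observed behavior; NOT how CF is actually implemented. */
final class TimeoutCheckSketch {

    /** One step of a request: a tag, its duration, and whether the engine
     *  compares elapsed time to the timeout before starting it. */
    static final class Step {
        final String tag;
        final int seconds;
        final boolean checksTimeoutAtStart;

        Step(String tag, int seconds, boolean checksTimeoutAtStart) {
            this.tag = tag;
            this.seconds = seconds;
            this.checksTimeoutAtStart = checksTimeoutAtStart;
        }
    }

    /**
     * Walks the steps in order and returns the tag on which the timeout would
     * be reported, or null if the request finishes without one being reported.
     */
    static String reportedTag(List<Step> steps, int timeoutSeconds) {
        int elapsed = 0;
        for (Step step : steps) {
            // The check happens only BEFORE certain tags, never during or after.
            if (step.checksTimeoutAtStart && elapsed > timeoutSeconds) {
                return step.tag;
            }
            elapsed += step.seconds; // the step always runs to completion
        }
        return null;
    }

    public static void main(String[] args) {
        List<Step> request = Arrays.asList(
            new Step("cfset (2 seconds of setup)", 2, false),
            new Step("cfquery", 75, true),   // the real culprit
            new Step("cfset", 0, false),     // skipped over: no check on this tag
            new Step("cfoutput", 0, true));  // first checking tag after the query
        System.out.println("Timeout reported on: " + reportedTag(request, 60));
    }
}
```

With the numbers from the example (2 seconds, then a 75-second query, against a 60-second timeout), the sketch blames the CFOUTPUT, not the CFQUERY that actually ran long and not the CFSET that came right after it.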

So this is where the error gets confusing

So the bottom line is that not only does the request NOT stop on the tag/function that took a long time, it doesn't even stop on "the next line" after that, which can make things all the more confusing/challenging to resolve.

Indeed, this is why you often see the error reporting as having occurred on a tag other than what was really the problem, and why you also can't just look at whatever was *the* line of code preceding that.

So what can you do with this information?

I don't mean to paint an entirely bleak picture. All is not lost. It's just a little more challenging than it should be.

At least first of all you can now know that when you see this error, you should NOT assume that it's reporting the line that really caused the problem. You can and should consider whether some earlier operation in the code could have taken a long time. In my experience, this is usually the situation.

I'll talk in a moment about some other tools that can help you understand where the time is really being taken. First, I do want to offer a clarification, lest anyone read my meaning too literally.

Are you saying the error message is always lying?

Well, no. You'll notice that I peppered my opening paragraphs with "usually", because it's certainly possible that a request could indeed be stopped on the very line that DID exceed the timeout.

Consider, in our example, what would happen if the long-running query had run for only 57 seconds. Since the request had taken 2 seconds before that, it is now one second short of timing out. Let's say the request then proceeds to loop over a query resultset or do some other operations that might take it a couple more seconds. When it does finally exceed the timeout, it may well happen right on the very tag that CF reports as having "crossed" the timeout time.

But given the problem of how it only reports that on some tags (and not all), it could still be in this situation that it reports the wrong line of code. Just consider all the above as you evaluate what to make of the situation.

So how can I know what tag did take a long time?

So how can you know what tag is taking a long time, when a request is running long? Or what did take a long time, if it finished in the past? This is a bit more challenging. The good news is that there are tools that can help, including the CF Enterprise Server Monitor, FusionReactor, and SeeFusion.

Let's focus first on using these tools to catch requests while they're still running, which could be valuable if your server is hanging up because of some long-running requests. Then we'll talk about using the tools to capture the same information and make it available by email to review later.

The underlying feature/solution: stack tracing

In either case, whether watching requests live or capturing information about them to review in the future, and in all three tools, the solution to identifying why a request is long-running will be based on "stack tracing" that request.

This is a feature built into the JVM, which is exposed easily by these tools, but missed entirely by many. More than that, some misunderstand stack tracing as something only shown at the bottom of error pages. (That is indeed a stack trace, but it's not nearly as useful as what I'm referring to here, which is for getting information on requests while they're running, not when they have had an error.)

Stack tracing a running request will allow you to see exactly what line of CFML (if any) is running at that very moment, which again can be vital for resolving problems of long-running requests.
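For those curious what this looks like at the JVM level, here is a minimal Java sketch. It starts a thread standing in for a request, lets it block inside a method standing in for a slow operation, and then asks the JVM for that thread's stack trace while it is still running, using the standard Thread.getStackTrace() method. The class and method names are hypothetical, and I'm not claiming this is how any of the monitoring tools are implemented internally.

```java
import java.util.concurrent.CountDownLatch;

/** Illustration only: names here are hypothetical, not from any monitoring tool. */
final class StackTraceSketch {

    /** Stands in for a long-running operation, such as a query awaiting a reply. */
    static void slowOperation(CountDownLatch entered, CountDownLatch release)
            throws InterruptedException {
        entered.countDown(); // signal that the "request" is now inside this method
        release.await();     // block until told to finish
    }

    /** Formats the frames the given thread is executing right now, one per line. */
    static String stackTraceOf(Thread thread) {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement frame : thread.getStackTrace()) {
            sb.append("  at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    /** Starts a "request" thread, samples its stack while it is blocked, then releases it. */
    static String sampleBlockedRequest() {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread request = new Thread(() -> {
            try {
                slowOperation(entered, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "request-1");
        request.setDaemon(true);
        request.start();
        try {
            entered.await();                      // wait until the request is in slowOperation
            String trace = stackTraceOf(request); // sampled while the request is still running
            release.countDown();                  // now let the request finish
            request.join();
            return trace;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sampling", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(sampleBlockedRequest());
    }
}
```

The printed frames include slowOperation, which is the point: the trace tells you where the request is right now, without the request having failed or finished.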

CF Enterprise Server Monitor

First, if you run CF Enterprise (8 or 9), you can use the CF Server Monitor to watch requests while running and see more details about them. If you use the available "start monitoring" button, you can see what requests are running in "Active Requests". Further, if you enable "start profiling", then if you double-click a running request, you can see in the middle of the next page a "stack trace", which shows the exact line of code that was executing at the time you double-clicked the request.

(Yes, I'm aware of the potential overhead of using the Server Monitor, though some people do over-state it in my experience. I'll point to other resources I've done on the Monitor, where I discuss its pros and cons, in a moment.)

Of course, viewing that stack trace at a random point in time during the life of a request could well mean simply that you'd see it executing just any random line, where perhaps a millisecond later CF will have moved on to another. The key is to refresh the stack trace, to see if CF indeed HAS moved on to a new line. If not, that line would be a smoking gun to investigate. Sadly the refresh icon in the Monitor doesn't update the details while viewing a running request. You need to go back to the list of active requests, open the request again, and repeat your observation.

Tools like FusionReactor and SeeFusion

Fortunately, tools like FusionReactor and SeeFusion make it a lot easier to obtain (and refresh) a stack trace while a request is running. Each offers a button to take a stack trace of a running request. From the page they show you can usually determine the line of CFML code that's running (they each offer a little more stack trace detail than the CF Server Monitor does, but I'll point you soon to a resource to help you better understand them.)

More important, each of these tools offers a refresh button to refresh the stack trace, so that you can properly determine if the line that's executing has changed while you're refreshing.

That said, I will note that FusionReactor offers an important advantage with respect to that refresh operation: it ties the stack trace display to the specific CFML page that was being viewed (in Running Requests page) when you selected it. So if that request ends while you're looking at its stack trace, and you refresh it, FusionReactor will report that it's finished.

SeeFusion, on the other hand, would not. It knows only the thread id on which the requested page was running, so that if the request ends and you refresh the stack trace, it only knows to refresh the stack trace for whatever request is running on that thread. It can't (and won't) tell you if the given request has in fact ended, so you could now be looking at a new (and different) request, which could be quite confusing in this situation. It's incumbent upon you to notice (when using SeeFusion) that the stack trace you see is indeed for the same request you started with. (FR gives each request its own internal request id, which is how it avoids that problem.)
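Why would a thread id not be enough to identify a request? Because app servers reuse threads from a pool. Here is a minimal Java sketch of that idea, using a one-thread pool that handles two "requests" in a row. The names are hypothetical, and the request id counter is just my stand-in for the kind of per-request id described above, not either tool's actual mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/** Illustration only: shows that a pooled thread outlives the requests it runs. */
final class RequestIdentitySketch {

    /** What we recorded about one handled request. */
    static final class Handled {
        final long requestId;
        final String threadName;

        Handled(long requestId, String threadName) {
            this.requestId = requestId;
            this.threadName = threadName;
        }
    }

    /** Runs two "requests" back to back on a pool containing a single thread. */
    static List<Handled> handleTwoRequestsOnOneThread() {
        AtomicLong nextRequestId = new AtomicLong(1);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Each request gets a fresh id but runs on whatever pooled thread is free.
            Callable<Handled> request = () -> new Handled(
                nextRequestId.getAndIncrement(), Thread.currentThread().getName());
            List<Future<Handled>> futures = new ArrayList<>();
            futures.add(pool.submit(request));
            futures.add(pool.submit(request));
            List<Handled> handled = new ArrayList<>();
            for (Future<Handled> future : futures) {
                handled.add(future.get());
            }
            return handled;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (Handled h : handleTwoRequestsOnOneThread()) {
            System.out.println("request " + h.requestId + " ran on thread " + h.threadName);
        }
    }
}
```

Both requests report the same thread name but different request ids. A stack trace looked up by thread alone can therefore silently switch to a different request once the first one ends.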

Catching running requests details when you're not watching the monitor tools

Of course, it's only possible to use the stack tracing features above if you can be on the server running the monitor tools when the problem occurs, right?

Well, not exactly: all three tools offer features to watch for a long-running request which can then send you notification by email of the details that would include a thread dump, which is a stack trace of all running requests.

In the CF Server Monitor, these are called Alerts. FusionReactor refers to them as Crash Protection notifications, and SeeFusion refers to this as "Active Monitoring Rules". See the documentation for each tool to find more information.
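As for what a thread dump is under the covers, the JVM exposes one through the standard Thread.getAllStackTraces() method. Here is a minimal Java sketch that formats it as text. The class name and the output layout are my own (hypothetical), and the dumps in the tools' alert emails will look different and carry more detail.

```java
import java.util.Map;

/** Illustration only: a bare-bones thread dump using a standard JVM API. */
final class ThreadDumpSketch {

    /** Returns the name, state, and stack frames of every live thread. */
    static String threadDump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n'); // blank line between threads
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(threadDump());
    }
}
```

In a real server the dump would include the request threads alongside the JVM's own housekeeping threads, so part of reading one is learning which threads matter.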

Learning more about stack tracing and the monitor tools

For more on all this, I discuss the idea of taking stack traces and thread dumps (a thread dump being a list of the stack traces for all current threads) in another blog entry.

I also discuss the CF Server Monitor, FusionReactor, and SeeFusion in several blog entries. The links just used are to the respective categories about each here in my blog. I've also discussed these topics (monitoring, stack tracing, and more) in various articles and presentations I've done.

The step debugger

Finally, some may point out that you can also get an idea of the time spent on any tag/function within a request if you use the interactive Step Debugger (whether that built into CFBuilder or the commercial FusionDebug alternative). As you step through the code, it would be clear if you got "hung up" on a line, though I don't know that I'd favor this as a solution here. Still, I've discussed these also in various blog entries, articles, and presentations.

(Sadly, you can't rely on the typical end-of-page debugging output, as enabled in the CF Admin, because that output is only shown if the request completes. We're referring here to pages that end in error.)

Can't I force CF to timeout some specific tags?

Again, in my experience (as I focus on CF server troubleshooting as a consultant), the root cause of problems in most "long-running" requests in CF is that some one tag or function is running long.

So could we perhaps force CF to timeout that specific tag? Well, yes and no.

You can, in fact, (and should) consider whether the tag in question might have its own TIMEOUT attribute or feature (and whether it will really help, as I'll explain.) Let's look at each of them.

Setting Timeout on CFHTTP, CFINVOKE, and others

There is indeed a TIMEOUT attribute on CFHTTP. Unfortunately, it won't ALWAYS keep the operation from exceeding that timeout. I've not quite put my finger on it (just haven't experimented completely), but if I had to guess, I'd say that it could be that if the operation is in the midst of returning data (from the server to CF), then it could perhaps time it out, whereas if it's waiting for the output then it may not be able to. Anyone know for sure?

There is also a TIMEOUT on CFINVOKE (for use when calling web services). Curiously, though, there is no TIMEOUT for use with CFOBJECT when calling web services (try it, it won't work, and none is documented). More curious still is that there IS a timeout available for use with createObject() (when calling a web service), though only by way of an argstruct argument that's new in CF8, which I have blogged about. Note as well that, according to the docs, that only times out the process of obtaining the WSDL, not the execution of any method in the web service.

There are also timeouts on various other operations that talk to something outside of CF (cfmail, cfftp on open/close operations, cfldap, cfpop, cffeed), though again it seems reasonable to expect that these may not always honor the timeout at the exact time given, as discussed above.

Setting Timeout on CFQuery

What about the elephant in the room, CFQUERY? Well, yes, it does have a TIMEOUT attribute, but many have found that it often does not timeout the query. Like the CFHTTP, I wonder if it may be a question of whether it's waiting for output (which likely can't be interrupted) or starting to receive it (which likely can be).

I will note that there's some promise in this regard, though for now not from CFML itself, but rather from the updated JDBC drivers in CF9 and the addition of a new timeout option in the CF Admin Datasource Advanced Settings page. You'll see that there is a new "query timeout" option that was not in CF before 9. I have blogged about it in more detail. It's not perfect: people are reporting different experiences with it (see the comments in the blog entry), and note (more important) that for now there seems no corresponding connection between this and the CFQUERY TIMEOUT attribute. (As I note there, I have raised a bug about this.) Still, it may be better than nothing and could help many, if you're on CF9.

So is there really nothing I can do for the hung requests?

OK, so we've explained why the requests don't timeout, often because they're talking to some remote process that is not responding. But what CAN you do when you're in this boat? Well, other than trying to add timeouts to the code as discussed above, generally nothing, at least for the requests that are already running.

And certainly a restart of CF will kill them off, or at least stop CF trying to talk to the remote process. (Of course, it's possible that upon restart, new requests will come in and try to connect to the same non-responsive or slow-responding remote process, so it could come right back.)

Stop the request on the remote server

But while you can't do much from WITHIN CF for these hung requests, there's one other way you may be able to stop the madness: stop the request on the remote server.

Once you can determine exactly what tag it is that's hung up (with the stack tracing tools above), you could then target whatever it was waiting for: the database server, a remote page called via CFHTTP, an exchange server using CFLDAP, etc.

Since the tools that let you stack trace the running request also show you the time the request started, you could use that info to go to the administrator of whatever service you're calling and ask if THEY may be able to kill the request. As soon as what you're waiting for stops, the CF request will continue. (Of course, it may only continue for a few milliseconds before it will be timed out by CF, as I discussed above, which is why I'm no fan of the CF request timeout feature, and think it should be turned off. More on that in a moment.)

Beware: you may not always find the remote server still "hung up"

Back to this issue of finding and killing the remote process that CF may be waiting for (that's causing your hung request), I should note that there may be times when you would go to the remote administrator and say, "look, I have this long-running CF request that's waiting for this process (query, ldap request, web page, etc.) that is waiting forever for something that is running long on your server". And they may look and see nothing on their end that's running long. Doh!

Yep, it can happen, for various reasons, so just be sensitive to this. You may really then have no way at all to kill the hung request. But note, again, that you may be able to use this observation to do something more to prevent the problem in the future, perhaps on the remote server side.

For instance, I've heard some describe problems where CFQUERY processing has hung talking to an Oracle database and (if I've got it right) the problem is an inconsistency between the CF datasource connection timeout and Oracle's "session" timeout. If anyone has more details on that, please do share.

But my point is simply that CF may be "waiting" for a call that will never be answered and can't be terminated from the other end. Again, in such cases, you can only kill them by restarting CF, and then you need to investigate how/why the calls to the remote server are getting hung up in the first place. That's where logging information for diagnostic purposes may really come in handy, as is discussed next.

Logging what CF is getting hung up on, to show to the remote administrator

If this problem (of calls to remote servers that take too long or get hung up) is happening often, and/or you can't always be logged in to see when it's happening using the tools above, another idea is to log for yourself whenever you make such a call to a remote server (that you know tends to hang up), such as putting a CFLOG statement before and after the CFHTTP, CFLDAP, CFQUERY, etc.

At least then you'll be able to see when it does and doesn't take a long time. The log would also help you by showing when it logs a start but no stop.

You could also code it so that it only logs when it's slow, but being able to confirm that it's generally fast and only sometimes slow may be itself useful diagnostic info.
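In CFML you'd do this with a CFLOG before and after the tag in question. To show just the shape of the idea, here is a minimal sketch in Java. It writes a START line before the call and an END line (with elapsed time) after it. The class, the method names, and the log line format are all hypothetical, and a real version would write to a log file rather than an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Illustration only: brackets a remote call with start and end log lines. */
final class RemoteCallLogSketch {

    private final List<String> lines = new ArrayList<>();

    /**
     * Logs a START line, runs the call, then logs an END line with elapsed time.
     * If the call hangs or throws, the START line is left without a matching END,
     * which is itself the clue to look for in the log.
     */
    <T> T logged(String label, Supplier<T> call) {
        lines.add("START " + label);
        long begin = System.nanoTime();
        T result = call.get();
        long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
        lines.add("END " + label + " " + elapsedMs + "ms");
        return result;
    }

    /** The log lines written so far, oldest first. */
    List<String> lines() {
        return lines;
    }

    public static void main(String[] args) {
        RemoteCallLogSketch log = new RemoteCallLogSketch();
        log.logged("cfhttp to partner site", () -> "response body");
        for (String line : log.lines()) {
            System.out.println(line);
        }
    }
}
```

Logging every call (rather than only slow ones) is a deliberate choice here: it lets you confirm the call is usually fast, and an unmatched START stands out.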

Note that CF 9.0.1 by default adds new logging that does automatically log the start and end of calls to cfhttp, cffeed, and more to corresponding new logs (http.log, feed.log, etc.), which could also help.

Finally, as for logging the queries, you can get that from FusionReactor and SeeFusion automatically, as their "jdbc wrapper" features allow you to log every query (or optionally only those slower than a certain time). There is also a new "log activity" feature in the "advanced settings" of a CF datasource definition that could also log DB activity, though it is quite verbose and a tad unwieldy (not one line/row per query like the other two tools).

Bottom line: I'm no fan of request timeout features

So all that said, I'll repeat and clarify that I'm no fan of request timeout features, not that in the CF Admin, nor that offered in CF monitoring tools that offer to "kill requests" automatically, like the CF Server Monitor Alerts, FusionReactor's crash protection, and SeeFusion's active monitoring rules. I don't think they should be used, personally.

Let me be clear: I do love those tools and use them and help people use them daily. And I do love and highly recommend the features in those tools for sending you *alerts* when requests exceed a given time. What I don't like is them trying to kill them automatically, for all the reasons I outlined above. So I tell clients to turn off the "timeout requests" feature (though it does still make sense to use TIMEOUT attributes on certain tags, or may make sense to implement the CFSETTING RequestTimeOut on some page where you know that the reason it runs long is not one of these things that can't be killed anyway.)

Instead, I recommend (and help my clients daily) to use the alert info (from the CF Server Monitor, or FR or SF) to be notified if/when requests ARE taking too long--and NOT to kill them. Note that these tools all send the notice *as soon as* the request takes too long (whatever time you set), whereas CF's "log slow requests" feature only logs when requests end--and that's only IF they do end without failing.

So yes, get notified that requests are taking too long. Use the info in the alerts, which includes the stack trace info I discuss above. Do find and resolve the problem. Don't rely on (or in my opinion even use) auto-kill features, when in fact they nearly never are able to kill really problematic requests anyway.

Yes, yes, I do realize that there are some requests that CAN be interrupted by these timeout/kill features, but I'll assert that such requests are far less commonly the cause of any serious problems. Your mileage may vary, of course. But I make my statement based on several hundred instances of helping folks solve typical CF server problems.

So why is the "timeout requests" setting there?

One last thought worth considering: someone might reasonably ask, "Charlie, why are you such a hater of the setting? If Adobe has it there, it must be for a good reason."

Here's what I'd say to that: sure, when CF originally ran on C++ (prior to CF 6), perhaps this setting could be reasonably relied upon to ensure that requests would not run any longer than the set time. (I don't recall, but perhaps even then there may have been at least SOME tags that it couldn't interrupt.) But clearly since CF 6, in the Java model, this is no longer the case.

And yet if you read the Admin page, or its help, or the docs, or the comments from nearly anyone who considers the setting, the presumption is that this WILL stop requests from running longer than the x number of seconds indicated.

Why am I so impassioned/manic about this?

I hope I've made clear in this entry why I think that's not only wrong to conclude (in nearly all cases), but worse it sets up a tragic misconception of how CF works. If you think this should and will stop long requests (or that the alert features of the monitors will kill them), then you're going to be in for a shock when requests do hang up for an extended period of time. What are the implications?

  • You may totally under-estimate how many simultaneous request threads you should enable.
  • You may never pay attention to tools like CFSTAT or jrun metrics (to observe at least "how many requests are running" at any given time), which will help you see if/when requests are hung.
  • You may never bother to learn how to use the CF Server Monitor (or FR or SF), all of which can go still further and show not just how many requests are running (possibly hung) but a) how long, b) what the URL is, c) what the IP address is, and so much more, which can help you find and resolve problems.
  • You may never bother to learn how to do the stack tracing that I discuss above, which is often vital to understanding where and why any given request is hung (or was at the time an alert was thrown)
  • You may never bother to analyze logs that show the activity patterns (how many requests are running at periodic intervals, such as the FusionReactor "resource log" reports.) It's really THAT information that is vital to your understanding what to set for your simultaneous requests setting.
  • and so on

All of this info (and understanding) is VITAL to a very important and common class of CF server troubleshooting: why is CF up but not responding? It may be that requests are hung.

But if you assume, "well, they can't be running any more than x seconds", then you'll start to think "so it must be something else", and you figure you may as well just restart CF. Or you start reading about how someone suggests you change your JVM settings (which may have NOTHING TO DO with this problem, and not only not solve it but could cause new ones), and so on.

Again, I see this all the time. I hope by this entry to have helped avoid some of the very common misunderstandings on this subject that I frequently see either on lists, or in emails to me, or in my consulting engagements. If I seem passionate about it, it's because I am. Same with the memory issues I discuss in the related and similarly titled entry, CF911: Lies, damned lies, and when memory problems not be at all what they seem, Part 1.

Need More Help?

I mentioned above that I provide CF Server Troubleshooting consulting. If you need some help understanding how to apply the information above to your specific problem (or need help with any CF server, or CFBuilder, problem), I'm happy to help.

I don't need to come on-site, nor do you need to give me remote access. Instead, we can work easily and securely right over the web using Adobe Connect.

And I don't have any minimum time-block requirement--and I even offer a satisfaction guarantee. To learn more, including rate plans, see my consulting page. (I hope some will forgive this brief commercial here. I don't generally mention it, but since some say that they didn't know I offer such services, it seemed an appropriate point to mention it.)

Conclusion

So phew, another really long blog entry. But I hope it may help some people (and help some who help others).

As always, I welcome your feedback, corrections, additions, etc. Really, I ask for your feedback. If it helped, please say so. My blog doesn't get the traffic of many others. I often see that hundreds of people have read things, but few ever comment. I can't know if it's that I've answered every question (I can hope so), or that you weren't impressed. Like the guy said in Dirty Harry, "I gots to know". :-) Sometimes, all it takes is a few people to "prime the pump" and start commenting to lead others to do so. Why not grab the handle? :-) And if you think this would be helpful info for others, please do share it (tweet about it, mention it on mailing lists/forums when you see the problem raised, etc.)

I'm planning to better organize and package CF server troubleshooting resources (mine and others). We have a lot of great info out there for those solving CF problems. It can just be a challenge to sort through it all. I hope to help solve that. Look for more news to come on that front in time.

How do I love FusionReactor? Let me count the ways (6 minute interview video)

The folks behind FusionReactor have started a YouTube video channel and they recently posted a 6-minute interview with me that we did at CFUnited. In it, they ask and I recount the reasons I appreciate and recommend it. Check out the video, embedded also below.

FusionReactor is one of the leading CF Server Monitor tools, which works not only with CF 6/7/8/9, either Standard or Enterprise, but it also works with Railo, Open BlueDragon, and even BlueDragon JX 7.1. In fact, it works with any J2EE/JEE server or servlet engine.

If you're running a site on any of those platforms and ever have problems of slowness, instability, or any other "curious" problems, or just need to better understand the nature of requests that CF is processing, and how well (or poorly) it's doing it, FusionReactor is a great tool, for the reasons I outline. It's like having x-rays into the app server.

I've written and spoken about the tool quite a bit, and have a FusionReactor blog category here with over a dozen entries, as well.

New for CF9 (and 9.0.1): a query timeout that really works, with a caveat

This is a very interesting change in CF9 (and 9.0.1), which has slipped under the radar for the most part as far as I can tell.

Did you know there is now a setting in the DSN page of the CF Admin (for most DB drivers) that allows you to set a maximum timeout for queries against that DSN? It's a new feature enabled for the DataDirect drivers updated in CF 9. The caveat? It is ONLY settable there, not in CFQUERY itself, which is a shame (the existing TIMEOUT attribute is not the same and generally does not work). Still, the value of this even at the DSN level is too important to ignore for some challenges. More on that (and some other thoughts) in a moment.

As for the setting, it's in the "Advanced Settings" section for a DSN in the CF 9 Admin, and it's called "Query Timeout". This should not be confused with the older settings, "Timeout" (which is about inactive connections) or "Login Timeout" (which is about logging in to the connection). The screenshot at right shows all 3. (This blog entry continues, with more information below it.)

I've run a test, and it really does do the job, which is huge. Why? Because it's been a long-time issue that if a CFQUERY gets hung up waiting for a response, that request thread (doing the CFQUERY) is then hung until the query finishes, which can sometimes be many minutes, or even hours or days, due to some odd situations. More important, a thread waiting for a query with no timeout can't be terminated (by the JVM, or CF, or the monitoring tools) because the thread is in a native thread state.

With this new option specified, if the query exceeds the timeout, the request does now fail, with a JDBC error, "Execution timeout expired." The same test does NOT time out with the older cfquery TIMEOUT attribute.

Here are some other notes on the new feature:

  • It works with most of the database drivers. I have confirmed that the setting appears in the DSN settings page for SQL Server, Oracle, MySQL (DataDirect), DB2, Informix, and Sybase in CF 9 Enterprise, and for both SQL Server and MySQL (DataDirect) in CF 9 Standard.
  • The timeout's specified in seconds.
  • You can learn more in the CF Admin guide, in this specific page.
  • Oddly, the only reference to this new setting on that CF Admin manual page is in the MySQL settings, but again it does appear for all the drivers above.
  • The manual page does at least reference the other DBMSs by name where it lists the methods of the Admin API, which you can use to set this default in the DSN programmatically.
  • That said, again there is sadly no new QueryTimeout (or QTimeout) attribute for CFQUERY, so for now we can only set this at the datasource level, not per query.
  • I've raised the concerns above on the CF Admin livedocs page (or whatever we are to call it now.)
  • If you look under the covers (in [cf]\lib\neo-datasource.xml), there is in fact a querytimeout connectionstring that this setting controls. If only there were a way we could pass connectionstring values to the CFQUERY, we'd be golden. Some may recall we used to have just such an attribute (ConnectString), but sadly it was deprecated in CF 6. I did try it, to no avail.
  • I have raised a bug with Adobe to get a new attribute for CFQUERY related to this. It's bug 83592. Please add your vote of support for it.
  • There's another page on this Admin setting, in the CF Dev Guide, for those who may be interested in following any other possible places where this feature may be discussed (in the comments there).
  • If you want some code to use to test a request waiting a long time for the DB to return, most databases offer a statement that tells the DB to wait for some time. In SQL Server, that's WAITFOR DELAY 'hrs:mins:secs'. Just use that in a CFQUERY, assuming your DSN definition doesn't limit what SQL statements you can use, in the 'Allowed SQL' section of the Advanced Settings page.
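
For those who want to try the WAITFOR test mentioned in the last note, here is a minimal sketch. It assumes a SQL Server datasource; the name "timeoutTestDSN" is hypothetical, and it assumes you have set that DSN's "Query Timeout" to something well under 30 seconds.

```cfml
<!--- "timeoutTestDSN" is a hypothetical DSN name; substitute your own SQL Server DSN --->
<cftry>
   <!--- ask SQL Server to wait 30 seconds before returning --->
   <cfquery name="slowTest" datasource="timeoutTestDSN">
      WAITFOR DELAY '00:00:30'
   </cfquery>
   <cfoutput><p>The query finished without timing out.</p></cfoutput>
   <cfcatch type="database">
      <!--- with the DSN-level Query Timeout in effect, expect the JDBC timeout error here --->
      <cfoutput><p>Query failed: #cfcatch.message# #cfcatch.detail#</p></cfoutput>
   </cfcatch>
</cftry>
```

If the DSN-level timeout is working, the request should fail after roughly the configured number of seconds rather than waiting the full 30.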

Why is this setting important?

I think this is a very important setting, and though it has been a hidden gem, it seems, it's one that people should consider. That's why I've flagged this in my CF911 category.

If you're suffering situations where requests are hanging due to long-running queries, and you have not been able to solve the real root cause for why they are hanging (which I always recommend first and foremost), then at least this option can help avoid a situation where queries can run without time limit. An Admin can decide that no queries should be allowed to run more than x seconds.

With that power comes responsibility, of course, and caution. You wouldn't want to preclude someone being able to run a query that really needed to take a long time. That's why it would really be better if this were settable on a per-query basis. (And no, the CF page timeout settings are NOT the solution here, because again as I said above, they cannot time out some kinds of long-running tags, like CFQUERY, CFHTTP, CFINVOKE of a web service, etc.)

What one could do, though (for now), is create different DSNs, where one could be used for most query processing, and another could be used for long-running requests. Yes, it's ok to have 2 DSNs point to the same DB. This same technique has been used when wanting to have most queries run against one DSN with a limited set of "allowed SQL" (per the DSN advanced settings) while another DSN has unfettered SQL access.
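
As a rough illustration of that two-DSN idea, consider the sketch below. The DSN names ("appdb" and "appdb_longrunning") and the table and column names are all hypothetical; the assumption is that both DSNs point to the same database and differ only in their "Query Timeout" value in the CF Admin.

```cfml
<cfparam name="url.id" type="integer" default="1">

<!--- "appdb" (hypothetical): Query Timeout set to a low value in the CF Admin --->
<cfquery name="getCustomer" datasource="appdb">
   SELECT id, name
   FROM customers
   WHERE id = <cfqueryparam value="#url.id#" cfsqltype="cf_sql_integer">
</cfquery>

<!--- "appdb_longrunning" (hypothetical): same database, but a higher (or no) Query Timeout --->
<cfquery name="yearEndReport" datasource="appdb_longrunning">
   SELECT region, SUM(total) AS sales
   FROM orders
   GROUP BY region
</cfquery>
```

The only thing that changes per query is the datasource attribute, so moving a query from one timeout policy to the other is a one-word edit.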

Hope this is helpful to some. Let me know what you think, whether this was helpful or if you feel I left something out. Especially please let me know if you know of a way that we can indeed pass connectionstring values on a per-query basis.

Some code to throttle rapid requests to your CF server from one IP address

Some time ago I implemented some code on my own site to throttle when any single IP address (bot, spider, hacker, user) made too many requests at once. I've mentioned it occasionally and people have often asked me to share it, which I've happily done by email. Today with another request I decided to post it and of course seek any feedback.

It's a first cut. While there are a couple of concerns that will come to mind for some readers, and I try to address those at the end, it does work for me and has helped improve my server's stability and reliability, and it's been used by many others.

Background: do you need to care? Perhaps more than you realize

As background, in my consulting to help people troubleshoot CF server problems, one of the most common surprises I help people discover is that their servers are often being bombarded by spiders, bots, hackers, people grabbing their content, rss readers, or even just their own internal/external ping tools (monitoring whether the server is up.)

It can either be that there are many more than they expect, coming more often than they expect, or they may come extremely fast to your server (even many times a second). This throttle tool can help deal with the latter.

Why you can't "just use robots.txt and call it a day"

Yes, I do know that there is a robots.txt standard (or "robots exclusion protocol") which, if implemented on your server, robots should follow so as not to abuse your site. And it does offer a crawl-delay option.

The first problem is that some of the things I allude to above aren't bots in the classic sense (such as RSS readers, ping tools). They don't "crawl" your site, so they don't regard that they need to be told how/where to look. They're just coming looking for a given page.

A second is that the crawl-delay is not honored by all spiders.

The third problem is that some bots simply ignore the robots.txt, or don't honor all of it. For instance, while Google honors the file in terms of what it should look at, my understanding is that it does not regard it with respect to how often it should come. Instead, Google requires you to implement the webmaster toolkit for your site to control its crawl rate.

Then, too, if you may have multiple sites on your server, the spider or bot may not consider that in deciding to send a wave of requests to your server. It may say "I'll only send requests to domain x at a rate of 1 per second", but it may not realize that it's sending requests to domains x, y, z (and a, b, and c) all of which are one server/cluster, which could lead a single server to in fact be hit far more than once a second (in that scenario). It may seem that's an edge case, but honestly it's not that unusual from what I've observed.

Finally, another reason all this becomes a concern is that of course there can be many spiders, bots, and other automated requests all hitting your server at once sometimes. My tool can't help with that, but it can at least help with the other points above.

(As with so much in IT and this very space, things do change, so what's true today may change, or one may have old knowledge, so as always I welcome feedback.)

The code

So I hope I've made the case for why you should consider some sort of throttling, such that too many requests from one IP address are rejected. I've done it in a two-fold approach, sending both a plain text warning message and an http header that is appropriate for this sort of "slow down" kind of rejection. You can certainly change it to your taste.

I've just implemented it as a UDF (user-defined function). Yes, I could have also written it all in CFScript (which would run in any release, as there's nothing in that code that couldn't be written in script, except the CFLOG, which could be removed). But since CF6 added the ability to define UDFs with tags, and to keep things simplest for the most people, I've just done it as tags. Feel free to modify it to all script if you'd like. It's just a starting point.

I simply drop the UDF into my application.cfm (or application.cfc, as appropriate). Yes, one could include it, or implement it as a CFC method if they wished.

<cffunction name="limiter">
   <!---
      Written by Charlie Arehart, charlie@carehart.org, in 2009, updated 2012
      - Throttles requests made more than "count" times within "duration" seconds from single IP.
      - sends 503 status code for bots to consider as well as text for humans to read
      - also logs to a new "limiter.log" that is created automatically in cf logs directory, tracking when limits are hit, to help fine tune
      - note that since it relies on the application scope, you need to place the call to it AFTER a cfapplication tag in application.cfm
      - updated 10/16/12: now adds a test around the actual throttling code, so that it applies only to requests that present no cookie, so should only impact spiders, bots, and other automated requests. A "legit" user in a regular browser will be given a cookie by CF after their first visit and so would no longer be throttled.
      - I also tweaked the cflog output to be more like a csv-format output
   --->

   <cfargument name="count" type="numeric" default="3">
   <cfargument name="duration" type="numeric" default="3">

   <cfif not IsDefined("application.rate_limiter")>
      <cfset application.rate_limiter = StructNew()>
      <cfset application.rate_limiter[CGI.REMOTE_ADDR] = StructNew()>
      <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = 1>
      <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
   <cfelse>
      <cfif cgi.http_cookie is "">
         <cfif StructKeyExists(application.rate_limiter, CGI.REMOTE_ADDR) and DateDiff("s",application.rate_limiter[CGI.REMOTE_ADDR].last_attempt,Now()) LT arguments.duration>
            <cfif application.rate_limiter[CGI.REMOTE_ADDR].attempts GT arguments.count>
               <cfoutput><p>You are making too many requests too fast, please slow down and wait #arguments.duration# seconds</p></cfoutput>
               <cfheader statuscode="503" statustext="Service Unavailable">
               <cfheader name="Retry-After" value="#arguments.duration#">
               <cflog file="limiter" text="'limiter invoked for:','#cgi.remote_addr#',#application.rate_limiter[CGI.REMOTE_ADDR].attempts#,#cgi.request_method#,'#cgi.SCRIPT_NAME#', '#cgi.QUERY_STRING#','#cgi.http_user_agent#','#application.rate_limiter[CGI.REMOTE_ADDR].last_attempt#',#listlen(cgi.http_cookie,";")#">
               <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = application.rate_limiter[CGI.REMOTE_ADDR].attempts + 1>
               <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
               <cfabort>
            <cfelse>
               <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = application.rate_limiter[CGI.REMOTE_ADDR].attempts + 1>
               <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
            </cfif>
         <cfelse>
            <cfset application.rate_limiter[CGI.REMOTE_ADDR] = StructNew()>
            <cfset application.rate_limiter[CGI.REMOTE_ADDR].attempts = 1>
            <cfset application.rate_limiter[CGI.REMOTE_ADDR].last_attempt = Now()>
         </cfif>
      </cfif>
   </cfif>
</cffunction>

Then I call the UDF, using simply cfset limiter(), as shown below. That's it. No arguments need be passed to it, unless you want to override the defaults of limiting things to 3 requests from one IP address within 3 seconds.

<!--- the following must be done after cfapplication --->
<cfset limiter(count=3,duration=5)>

Note that since the UDF relies on the application scope, you need to place the call to it AFTER a cfapplication tag if using application.cfm.
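
For those using application.cfc instead, one possible arrangement is sketched below. It assumes the UDF above has been saved in a file named limiter.cfm (a hypothetical name) alongside Application.cfc, and the application name is also just a placeholder. Since the application scope is already available by the time onRequestStart runs, the ordering concern above doesn't arise in the same way.

```cfml
<cfcomponent output="false">
   <!--- "myApp" is a placeholder application name --->
   <cfset this.name = "myApp">

   <!--- limiter.cfm is a hypothetical file holding the limiter UDF shown above;
         including it here makes limiter() available as a method of this component --->
   <cfinclude template="limiter.cfm">

   <cffunction name="onRequestStart" returntype="boolean" output="true">
      <cfargument name="targetPage" type="string" required="true">
      <!--- throttle before any page processing happens --->
      <cfset limiter(count=3, duration=5)>
      <cfreturn true>
   </cffunction>
</cfcomponent>
```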

Caveats

There are definitely a few caveats to consider, and some concerns/observations that readers may have. The first couple have to do with the whole idea of doing this throttling by IP address:

  • First, some will be quick to point out that a potential flaw in the approach of throttling by IP address is that you may have some visitors who are behind a proxy, where they appear to your server to all be coming from one IP address. This is a dilemma that requires more handling. For instance, one idea would be to key on yet another field in the request headers (like the user agent), so that you use two keys to identify "a user" rather than just the IP address. If you think that's an issue for you, feel free to tweak it and report back here for others to benefit. I didn't choose to bother with that, as in my case (on my site), I just am not that worried about the problem. Note that the log that I create will help you determine if/when the UDF is doing any work at all.
  • Other folks will want to be sure I point out that many spiders and other automated request tools now may come to your site from different IP addresses, still within that short timespan. My code would not detect them. For now, I have not put in anything to address this (it wouldn't be trivial). But the percentage of hits you'd fail to block because of this problem may be relatively low. Still, doing anything is better than doing nothing.
  • Speaking of the frequency with which this code would run, someone might reasonably propose that this sort of "check" might only need to be done for requests that look like spiders and bots. As I've talked about elsewhere, spiders and bots tend not to present any cookies, and so you could add a test near the top to only pay attention to requests that have no cookie (cgi.http_cookie is ""). I'll leave you to do that if you think it worthwhile. Since there's a chance that some non-spider requesters could also make such frequent requests, I'll leave such a test out for now. (Update: I changed this on 10/16/12 to add just that test, so the code above now only blocks such requests that "look like spiders". A legit browser visitor would get the cookie set by CF on the first request, so won't be impacted by this limiter.)
  • Someone may fear that this could cause spiders and bots to store this phrase "You are making too many requests too fast, please slow down and wait" (or whatever value you use). But I will note that I have searched Google, Bing, and Yahoo for this phrase and not found it as the result shown for a page on any site that may have implemented this code. (Since I give the status code of 503, I think that's why it would not store it as the result.)
  • Here's a related gotcha to consider, if you implement this and then try to test it from your browser and find "I can't ever seem to get the error to show, even when I refresh the page often." Here's the explanation: some browsers have a built-in throttling mechanism of their own and they won't send more than x requests to a given domain from the browser at a time. I've spoken on this before, and you can read more from YSlow creator Steve Souders. So while you may think you can just hit refresh 4 times to force this, it may not quite work that way. What I have found is that if you wait for each request to finish and then do the refresh (and do that 4 times), you'll get the expected message. Again, use the logs for real verification of whether the throttling is really working for real users, and to what extent. (Separately, after the update above on 10/16/12 to only limit spiders/bots/requests without a cookie, that's another reason you'll never be throttled by this in a regular browser.)
  • Finally, someone may note that technically I ought to be doing a CFLOCK since I am updating a shared scope (application) variable. The situation in which this code is running is certainly susceptible to a "race condition" (two or more threads running at once, updating the same variable). But in this case, it's not the end of the world if two requests modify the data at once. And I'd rather not have code like this doing any CFLOCKing since it's prospectively running on all requests.
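
On the first caveat (visitors behind a shared proxy), here is one way the "two keys" idea might look. This is only a sketch of a possible tweak, not something I run: the variable name clientKey is hypothetical, and the line would go inside the limiter function just after the cfargument tags.

```cfml
<!--- hypothetical tweak: identify a client by IP address plus a hash of the user agent --->
<cfset var clientKey = CGI.REMOTE_ADDR & "|" & Hash(CGI.HTTP_USER_AGENT)>
```

Every reference to application.rate_limiter[CGI.REMOTE_ADDR] in the UDF would then become application.rate_limiter[clientKey]. Two visitors behind the same proxy who use different browsers would get separate counters, though those with identical user agent strings would still share one.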

Some other thoughts

Beyond those caveats, there are a few more points about this idea that you may want to consider:

  • Of course, an inevitable question/concern some may have is, "but if you slow down a bot, might that not affect what it thinks about your site? Might it stop crawling entirely?" I suppose that's a consideration that each will have to make for themselves. I implemented this several months ago and haven't noticed any change either in my page ranks, my own search results, etc. That's all just anecdotal, of course. And again, things can change. I'll say that of course you use this at your own risk. I just offer it for those who may want to consider it, and want to save a little time trying to code up a solution. Again, I welcome feedback if it could be improved.
  • Some may recommend (and others may want to consider) instead that this sort of throttling could/should be done at the servlet filter level, rather than in CFML (filters are something I've written about before.) Yep, since CF runs atop a servlet engine (JRun by default), you could indeed do that, which could apply then to all applications on your entire CF server (rather than implemented per application like above.) And there are indeed throttling servlet filters, such as this one. Again, I offer this UDF for those who aren't interested in trying to implement such a filter. If you do, and want to share your experience here, please do.
  • BlueDragon fans will want to point out that they don't need to code a solution at all (or use this), because it's had a CFTHROTTLE tag for several years. Indeed it has. I do wish Adobe would implement it in CF (I'm not aware of it existing in Railo). Until then, perhaps this will help others as it has me. (The BD CFThrottle tag also implements a solution for the problem of possible visits by folks behind a proxy, with a TOKEN attribute allowing you to key on yet another field in the request headers.)
  • There is another nasty effect of spiders, bots, and other automated requests, and that's the risk of an explosion of sessions which could eat away at your java heap space. People often accuse CF of a memory leak, when it's really just this issue. I've written on it before (see the related entries at the bottom here, above the comments). This suggestion about throttling requests may help a little with that, but it really is a bigger problem with other solutions, which I allude to in the other entries.
  • It would probably be wise to add some sort of additional code to purge entries from this application-scoped struct, lest it grow in size forever over the life of a CF server. It's only really necessary to worry about entries that are less than a minute old, since any older than that would not trigger the throttle mechanism (since it's based on x requests in y seconds). It may not be wise to do this check on every request, but it may be wise to add another function that could be called, perhaps as a scheduled task, to purge any "old" entries.
  • Finally, yes, I realize I could and should post this UDF to the wonderful CFlib repository, and I surely will. I wouldn't mind getting some feedback if anyone sees any issues with it. I'm sure there's some improvement that could be made. I just wanted to get it out, as is, given that it works for me and may help others.
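
As for the idea above of purging old entries, here's a minimal sketch of what such a function might look like. The function name purgeLimiter and its maxAge argument are hypothetical (they're not part of the UDF above), and like the limiter itself it does no locking, so treat it as a starting point.

```cfml
<cffunction name="purgeLimiter" returntype="numeric" output="false">
   <!--- hypothetical helper: removes limiter entries older than "maxAge" seconds, returns how many were removed --->
   <cfargument name="maxAge" type="numeric" default="60">
   <cfset var keys = "">
   <cfset var i = 0>
   <cfset var purged = 0>

   <cfif not IsDefined("application.rate_limiter")>
      <cfreturn 0>
   </cfif>

   <!--- take a snapshot of the keys so we are not deleting from the struct while looping over it --->
   <cfset keys = StructKeyArray(application.rate_limiter)>
   <cfloop from="1" to="#ArrayLen(keys)#" index="i">
      <cfif StructKeyExists(application.rate_limiter, keys[i])
            and DateDiff("s", application.rate_limiter[keys[i]].last_attempt, Now()) GT arguments.maxAge>
         <cfset StructDelete(application.rate_limiter, keys[i])>
         <cfset purged = purged + 1>
      </cfif>
   </cfloop>

   <cfreturn purged>
</cffunction>
```

You could call it with cfset purgeLimiter() from a page that a scheduled task requests every few minutes. Since the limiter may be rewriting an entry at the same moment, you may want to wrap the call in a cftry so that a rare collision doesn't fail the task.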

Besides feedback/corrections/suggestions, please do also let me know here if it's helpful for you.

Free tools for SAN monitoring, VM Monitoring and more...and their educational site

Folks know that I like to share news of tools (see my CF411 site), but I want to point out here a couple of free ones in particular that may address problems people are having in new/modern configurations: one is a tool for monitoring a SAN, and the other is for monitoring VMs.

It also gives me a chance to offer some props for the site of the company behind the tools, SolarWinds, which again many may find valuable in educating not only about the tools but the topics that the tools help with.

The free SAN and VM monitoring tools

The two tools (and one more for bonus) are:
  • SolarWinds Free SAN Monitor - keep a close eye on the performance & capacity of your storage arrays and become a storage superhero!
  • VM Monitor - continuously monitor a VMware® ESX Server and its virtual machines with at-a-glance virtualization health statistic
  • WMI Monitor - monitor your Windows® apps and servers in real time, using built-in, community-sourced, and customizable application templates!

I haven't yet used them myself, so this isn't so much a recommendation of the tools but rather a recommendation that you consider them if you are interested in what they have to offer.

The company offers still more free tools, as well as commercial ones of course.

A company that gets how to educate you about their products

You may have noticed above that I offered as well links to videos about each product. SolarWinds has really done a great job offering educational resources, especially videos, and organizing them into categories such as tech talks, webcasts, and more.

Indeed, if you are new to network management (which can be a broad and/or deep subject, appealing variously to generalist IT geeks and hard-core network admins), they offer lots of compelling introductory resources, including their geek guides and even certification training. Of course they also have a helpful blog and twitter feed.

Just as I previously praised the Mura folks as a "company who got it right" in terms of setting up a compelling, informative web site for IT folks, I really have to say the same for the SolarWinds folks. Congrats, and thanks.

New DataDirect JDBC Type 5 drivers (for SQL Server, MySQL, Oracle, and more)

I think most folks know that the underlying database drivers in CF are from DataDirect. Well, they've announced new "Type 5" drivers. While you would have to buy and install them separately from those built-into CF (for now, as Adobe has not yet certified CF for use with them), I think some people may want to give them serious consideration even before then.

Several Performance Advantages, and Failover As Well

Among their advantages (over the older Type 4 drivers that are built into CF) are faster performance and lower memory use (more here), as well as greater tunability in general (discussed among other features between 4 and 5), and still more.

For some, more compelling still is the new driver-based failover, to have the driver detect a connection failure and switch communications to another DB instance (much like CF 8 Enterprise added the ability to do the same with mail server configuration in CF). This could be a huge improvement for many.

Getting More Info, Trials, Purchase

You can learn more about (and purchase or get a trial of) the drivers at http://www.datadirect.com/products/jdbc/index.ssp. The page also includes links to more on the benefits of type 5, comparison to Type 4, and so on. (Sadly, I just missed their cleverly titled "These go to 11" webinar earlier today. Perhaps they'll offer another soon.)

When Might Adobe Bundle Them in CF?

You may have caught that I referred to "purchase" of the drivers. Sadly, no, the drivers are not free. Adobe has licensed and included them in past releases of CF. So when might Adobe modify CF to bundle the new drivers?

Some have heard that Adobe's working on an updater for CF 9 (mentioned publicly at cf.Objective() among other places). Can we expect the drivers to be updated in that release? It's hard to say. In the 7.0.2 timeframe, the then-new drivers (3.5) came out just after the release and so were not included in the updater but instead were mentioned as a manual option in the CHF 1 technote. (Many missed that, and I suspect many shops still running CF 7 to this day still have not updated those drivers.) The point is, they did not include it with the updater (as I blogged about at the time in this blog entry.)

I can understand why they may not bundle new DB drivers with a CF updater: with updaters, they have to rebuild and retest all the installers on all the platforms, and updaters are generally not about adding "new features". It's a slippery slope as to whether driver updates are "new". It may also be a matter of timing, with respect to when Adobe learned about the drivers (likely sooner than most of us) and whether they found them compelling enough to consider, and at what cost in terms of backward compatibility. It's worth noting that the Type 5 drivers do assert to bring all their benefits "without code changes", so who knows?

The bigger question for some may really be simply, "I'm ok if they make me install it manually. It would just be nice to not have to license them ourselves." That's a fair point, and we just won't know until Adobe does address this issue (perhaps as they did in 7.0.2 CHF 1.) Until then, we have to wait and see.

But again, I do think the update compelling enough that shops with interest in performance should at least consider testing out the free trial to see how things go, or to prepare for the possibility of leveraging that new failover feature. (It also remains to be seen if Adobe may somehow try to force that failover feature to be an Enterprise-only feature, just as the mail server failover is.)

About the Company and Product Name Variations

Some may wonder if they visit the site for the drivers and notice the company being named Progress, when it used to be Merant. You can read about the fairly long chain of changes in the company history page.

One other thing worth noting is that the product is sometimes referred to as "DataDirect JDBC Type 5 drivers" (as I show in the subject here) and yet also sometimes as "DataDirect Connect for JDBC". They're the same thing.

One Boo Boo on the DataDirect Site

As I was tooling around their site, I happened on the page about use of the tool with JBoss servers. They said this:

The top web and application server vendors embed or recommend DataDirect JDBC drivers as part of their J2EE certification strategies (JBoss, IBM WebSphere, Sun Microsystems, Oracle BEA, and Macromedia.)

Can you spot the mistake? There is no more Macromedia, of course. I tried to find a page on the site to send a comment, but their Contact page didn't seem to offer a suitable option for such a comment. Oh well, so I am adding it here in case someone who cares can get something done about it.

Finally, I'll point out another useful page on their site, not new to the Type 5 drivers, but a helpful list of performance tips.

The Ultimate Var Scope Resource list? Understanding/resolving problems with the var scope in CFML

If you or anyone you know ever wants to get up to speed on the "var scope problem" in CF, you may be challenged by the fact that there are many discussions of the topic, spread across many blogs. I've accumulated here a starting list of several of the key ones I know of. I certainly may have missed some, so I welcome suggestions of more.

I think it's helpful to have all the resources in one place. Indeed, ultimately I'll move this to my "resource lists" page where I keep similar "compendia".

I created the list of VAR scope resources today after helping a client with a problem which seemed related to this classic problem: the need to remember to var scope your variables in CFCs. It's often the cause of subtle bugs. Like many, they still hadn't heard about the problem (or had seen mention of it but didn't really understand it).

So if you're in that place, or know someone who may be, here are resources to help get started on understanding the topic and related issues. As always, the CF community has rallied the troops on the matter, and several folks have blogged in various detail or on various related aspects.

About the resources

The first few elaborate on the problem, and the first one even includes a live running example to demonstrate the point. Then a couple explain some related issues.

I also then list resources on the related new "local" scope in CF 9, and some more that discuss compatibility issues with that.

Note that these may have been written any time in the past couple/few years, so keep that in mind and be sure to check the comments as well.

The resources

First, some general resources introducing the var scoping issue/feature:

And some that address various issues related to using the VAR scope:

And some on the change in CF9, for the local scope:

And some tangential discussions of that, including some compatibility issues with the new scope:

The VarScope tool

Finally, of course, as most of these resources point out, be sure to use Mike Schierberl's wonderful varscoper tool to help find whether you have instances of this problem. It tells you literally what to change, and where, so don't let concern about this problem overwhelm you.

Conclusion

Hope those are helpful. Again, it's just a starting list. I welcome additions, and I look forward to your comments. In time, I'll move this (and any suggested additions) to my "resource lists" page. Check that out for similar lists of resources on various subjects.


BlogCFC was created by Raymond Camden. This blog is running version 5.005.
