[Looking for Charlie's main web site?]

CF911: Are you finding performance problems with CFDOCUMENT? Aware of the important LOCALURL attr.?

This is something that I find nearly no one has talked about, as a problem and solution. Did you know that by default, a single request doing a CFDOCUMENT may cause CF to execute several additional requests, each doing a CFHTTP to grab any images on the page? even if the images could be found locally on the server? This can be quite tragic.

The good news is that the problem can be solved using the simple LOCALURL attribute. The bad news is that you have to do it at all, and that if you don't do it, it can have such unfortunate and unexpected impact. (And just as bad, again, is that hardly anyone has talked about it.) This entry will elaborate on the issue (and a couple of other possible CFDocument performance issues, as a bonus.)

I've been meaning to write about the importance of this problem and solution (the LocalURL attribute) for a long time (it came out in CF8). Often when I'm helping people with CF troubleshooting problems, whether on mailing lists or in my consulting services, I've been able to show that long-running requests (or an unexpectedly excessive number of requests) were sometimes due to this very problem.

Basics of the LocalURL attribute

Before we go any further, let's start with the details of the attribute (which may surprise many). The following is extracted from a larger blog entry about various CFDocument improvements in CF8 from Adobe Engineer, Rupesh Kumar, back in 2007:

When CFDocument body contains a relative URL, ColdFusion will resolve the relative URL to an absolute URL and will send an HTTP request for this url [emphasis, mine]. A side effect of this is Server ends up sending HTTP request even for local URL or images that are lying on the local file system which obviously hurts the performance. In ColdFusion 8, we have added a new attribute "localURL" to cfdocument tag which if enabled, will try to resolve the relative URLs as file on the local machine.

  • localURL : "true" | "false" It should be enabled if the images used in cfdocument body are on the local machine. This would make the cfdocument engine retrieve the images directly from the file system rather then asking the server for it over http

This attribute helps reducing the load from the server so that the same web server thread can now serve user request instead of serving local images to CFDocument. This also addresses some of the "missing image" problems which I mentioned here. Here is a sample code using this attribute.

<cfdocument format="PDF" localUrl="true">
<table>
<tr>
<td>bird</td>
<td><image src="images/bird.jpg"></td>
</tr>
<tr>
<td>Rose</td>
<td><image src="images/rose.jpg"></td>
</tr>
</table>
</cfdocument>

So why is this an issue?

It would help to make a point of clarification: CFDOCUMENT builds a document (perhaps a PDF or Word doc) on the server, from whatever CFML and/or HTML is within the tags.

As such, CF will need to "get" whatever images (or scripts or css files) are defined on the generated HTML page (as img src, script src, link href, etc.) so it can build the resulting "document" file on the server.

And the point of the dilemma (identified on this page) is that a single CFDOCUMENT with many such img src (or similar) tags will not just cause CF to look to the file system to get the files they point to. Instead, CF will get them via CFHTTP.

Technically, the issue is that CF doesn't necessarily "know" that the location pointed to in the img src and similar tags *is* on the local server. So it presumes it has to get ANY of them using a CFHTTP call. That's what causes the problem, if there are many of them. Using the LOCALURL attribute tells CF instead that it SHOULD look for the files on the local filesystem and NOT do a CFHTTP to get them.

Sadly, this issue is only barely mentioned in the CF Docs age on CFDocument, and it would be easy to miss the point it's trying to make. (It says only that CF "requests the server for images over HTTP even though the image files are stored locally".)

How some tried to workaround the problem with file:// references

While looking around to see who else maybe had talked about this fix (which, as I noted, I found virtually no other references to it), I did find that some people had effectively "solved" the problem by telling CF to use "file:///" protocol references in the img src and similar tags.

Among the entries discussing this were:

It's understandable that people figured it out as a work-around, and perhaps someone even started sharing the idea before CF8 added this attribute, but the LOCALURL would seem generally the way to go.

Some other CFDOCUMENT performance resources

Though not related to the LocalURL attribute, there are some other things that may lead to performance problems.

First, those on CF Standard (as opposed to Enterprise or Developer edition) should know that CFDocument is one of several tags that are single-threaded through CF's "Enterprise Feature Router" (EFR).

Second, there are certainly still other possible CFDocument performance issues, and the following other resources address some of those:

Hope any or all of the info above may be helpful to some readers. Let me know what you think.

CF911: Lies, Damned Lies, and CF Request Timeouts...What You May Not Realize

How often have you seen (or seen others complain of getting) an error from ColdFusion such as:

The request has exceeded the allowable time limit Tag: cfoutput

Do you know what this means? It's usually not what you think. I've even seen experienced CF developers who get thrown by this challenge. In this entry I'll try to help explain a very common problem and correct some misconceptions. I'll even contend that this info is often useless and indeed misleading (and therefore the feature producing it ought not be relied upon, and should even be turned off). Along the way, I'll share some things that I've not seen documented elsewhere.

Strap on your seatbelts. We're going for a bit of a ride (if it was easy and could be understood in the length of a tweet, then perhaps everyone would already understand it!) As always, I welcome feedback.

What the error usually does NOT mean, though most assume it

People are often mystified: "Why in the heck would a CFOUTPUT take a long time?"

Or perhaps they're a little more savvy as to what's happening, and they assume, "No, it's just that the CF timeout time was reached when it got to the CFOUTPUT". That could be.

Sadly though, in most cases, neither is what has happened. CF is usually NOT reporting that "here where the app timed out".

What the error usually DOES mean--the surprise

"OK, smarty-pants. What does the error really mean? Are you saying that CF is lying to me?" Well, often, yes, I'm afraid so, but it's not something nefarious.

Rather, what's more typically the explanation is that some previous activity in the page/request, such as a CFQUERY, CFHTTP, invocation of a web service, or the like is what really took a "long time".

If it was this which caused the request to exceed the timeout (either as defined in the CF Admin Settings page, or using CFSETTING RequestTimeout, or a Timeout attribute on a tag), you'd of course expect CF to report it then and there. The problem is that, often, it cannot report it "right then". And it's not its fault.

There are some operations CF/the JVM cannot interrupt

The problem is that CF (and the JVM) cannot interrupt a request while it's processing what's called a "native method". That is quite typically the mode that a request is in while it's waiting for a reply from a CFQUERY, CFHTTP, and so on. These operations talk to something outside of CF (like a database with CFQUERY, or another server with CFHTTP or web service call--which could even be requesting a page from the same CF instance, but technically the underlying Java httpclient process doesn't know that.) It could also happen with file or network operations.

So the request will wait for this long-running operation to finish. It can't stop it, not with the CF Admin request timeout, not with CFSETTING RequestTimeout, not with the kill features in the CF Server Monitor, FusionReactor, and SeeFusion. Nothing. It's like the Anti-Terminator: "it absolutely will not stop" (can't be terminated) until its task is completed.

So what happens when the long-running operation finishes? Is that when the request times out? An example

"Ok, I got it. The long-running operation (CFQUERY, CFHTTP, whatever) will not stop until it's finished. What happens then?"

Well, you see, that's where the confusion comes in. Let's use an example to make things crystal clear.

Say that the CF admin timeout is 60 seconds (not at all uncommon), or perhaps you set the timeout to 60 for a given template using CFSETTING. Anyway, let's say that the request in question gets 2 seconds into processing when it starts running a long-running query (for example). Let's say that query then takes 75 seconds. When the query is done, the request has now run for 77 seconds, which is 17 seconds beyond the timeout time.

We already know that it won't stop the CFQUERY itself (at least until the query is finished). But guess what: it also will NOT report that the timeout has been exceeded on that line (whatever it was, CFQUERY, CFHTTP, etc.) From my experience, CF doesn't check the time against the timeout at the end of operations, but rather at the beginning.

So instead, it will proceed to the next line of code. You'd think, "ok, then, it will stop on whatever is the next line of code and give you the error there, right?" Sadly, not necessarily, and it only adds to the confusion of the timeout message.

CF checks the time at the start of the next operation, but sadly only on SOME tags

So it's bad enough that it won't report the error on the tag that DID run long. Instead, we saw that it will proceed to the next tag/function. But curiously (tragically), CF will often NOT stop on THE next line of code.

Instead, I've observed that it only seems to check the time (against the timeout) at the beginning of CERTAIN tags, such as CFOUTPUT, CFLOOP, CFQUERY, and so on. Yes, I'm saying that I've confirmed that it will skip over various other tags (such as CFSET, or CFSCRIPT code, and more). I've not yet found any documentation as to the details of this.

So this is where the error gets confusing

So the bottom line is that not only does the the request NOT stop on the tag/function that took a long time, it doesn't even stop on "the next line" after that, which can make things all the more confusing/challenging to resolve.

Indeed, this is why you often see the error reporting as having occurred on a tag other than what was really the problem, and why you also can't just look at whatever was *the* line of code preceding that.

So what can you do with this information?

I don't mean to paint an entirely bleak picture. All is not lost. It's just a little more challenging than it should be.

At least first of all you can now know that when you see this error, you should NOT assume that it's reporting the line that really caused the problem. You can and should consider whether some earlier operation in the code could have taken a long time. In my experience, this is usually the situation.

I'll talk in a moment about some other tools that can help you understand where the time is really being taken. First, I do want to offer a clarification, lest anyone read my meaning too literally.

Are you saying the error message is always lying?

Well, no. You'll notice that I peppered my opening paragraphs with "usually", because it's certainly possible that a request could indeed be stopped on the very line that DID exceed the timeout.

Consider in our example that if the long-running query had run for only 57 seconds. Now, since it had taken 2 seconds before that, it now is one second short of timing out. Let's say the request then proceeds to loop over a query resultset or do some other operations that might take it a couple more seconds. When it does finally exceed the timeout, it may well happen right on the very tag that CF Reports as having "crossed" the timeout time.

But given the problem of how it only reports that on some tags (and not all), it could still be in this situation that it reports the wrong line of code. Just consider all the above as you evaluate what to make of the situation.

So how can I know what tag did take a long time?

So how can you know what tag is taking a long time, when a request it running long? or did take a long time, if it finished in the past? This is a bit more challenging. The good news is that there are tools that can help, including the CF Enterprise Server Monitor, FusionReactor, and SeeFusion.

Let's focus first on using these tools to catch requests while they're still running, which could be valuable if your server is hanging up because of some long-running requests. Then we'll talk about using the tools to capture the same information and make it available by email to review later.

The underlying feature/solution: stack tracing

In either case, whether watching requests live or capturing information about them to review in the future, and in all three tools, the solution to identifying why a request is long-running will be based on "stack tracing" that request.

This is a feature built-into the JVM, which is exposed easily by these tools, but missed entirely by many. More than that, some misunderstand stack tracing as something only shown at the bottom of error pages. (That is indeed a stack trace, but it's not nearly as useful as what I'm referring to here, which is for getting information on request while they're running, not when they have had an error.)

Stack tracing a running request will allow you to see exactly what line of CFML (if any) is running at that very moment, which again can be vital for resolving problems of long-running requests.

CF Enterprise Server Monitor

First, if you run CF Enterprise (8 or 9), you can use the CF Server Monitor to watch requests while running and see more details about them. If you use the available "start monitoring" button, you can see what requests are running in "Active Requests". Further, if you enable "start profiling", then if you double-click a running request, you can see in the middle of the next page a "stack trace", which shows the exact line of code that was executing at the time you double-clicked the request.

(Yes, I'm aware of the potential overhead of using the Server Monitor, though some people do over-state it in my experience. I'll point to other resources I've done on the Monitor, where I discuss its pros and cons, in a moment.)

Of course, viewing that stack trace at a random point in time during the life of a request could well mean simply that you'd see it executing just any random line, where perhaps a millisecond later CF will have moved on to another. The key is to refresh the stack trace, to see if CF indeed HAS moved on to a new line. If not, that line would be a smoking gun to investigate. Sadly the refresh icon in the Monitor doesn't update the details while viewing a running request. You need to go back to the list of active requests, open the request again, and repeat your observation.

Tools like FusionReactor and SeeFusion

Fortunately, tools like FusionReactor and SeeFusion make that refresh a lot easier, to obtain a stack trace while request is running. Each offers a button to take a stack trace of a running request. From the page they show you can usually determine the line of CFML code that's running (they each offer a little more stack trace detail than the CF Server Monitor does, but I'll point you soon to a resource to help you better understand them.)

More important, each of these tools offer a refresh button to refresh the stack trace, so that you can properly determine if the line that's executing has changed while you're refreshing.

That said, I will note that FusionReactor offers an important advantage with respect to that refresh operation: it ties the stack trace display to the specific CFML page that was being viewed (in Running Requests page) when you selected it. So if that request ends while you're looking at its stack trace, and you refresh it, FusionReactor will report that it's finished.

SeeFusion, on the other hand, would not. It knows only the thread id on which the requested page was running, so that if the request ends and you refresh the stack trace, it only knows to refresh the stack trace for whatever request is running on that thread. It can't (and won't) tell you if the given request has in fact ended, so you could now be looking at a new (and different) request, which could be quite confusing in this situation. It's incumbent upon you to notice (when using SeeFusion) that the stack trace you see in indeed for the same request you started with. (FR gives each request its own internal request id, which is how it avoids that problem.)

Catching running requests details when you're not watching the monitor tools

Of course, it's only possible to use the stack tracing features above if you can be on the server running the monitor tools when the problem occurs, right?

Well, not exactly: all three tools offer features to watch for a long-running request which can then send you notification by email of the details that would include a thread dump, which is a stack trace of all running requests.

In the CF Server Monitor, these are called Alerts. FusionReactor refers to them as Crash Protection notifications, and SeeFusion refers to this as "Active Monitoring Rules". See the documentation for each tool to find more information.

Learning more about stack tracking and the monitor tools

For more on all this, I discuss the idea of taking stack traces and thread dumps (which is a list of all stack traces for all current threads) in another blog entry.

I also discuss the CF Server Monitor, FusionReactor, and SeeFusion in several blog entries. The links just used are to the respective categories about each here in my blog. I've also discussed these topics (monitoring, stack tracing, and more) in various articles and presentations I've done.

The step debugger

Finally, some may point out that you can also get an idea of the time spent on any tag/function within a request if you use the interactive Step Debugger (whether that built into CFBuilder or the commercial FusionDebug alternative). As you step through the code, it would be clear if you got "hung up" on a line, though I don't know that I'd favor this as a solution here. Still, I've discussed these also in various blog entries, articles, and presentations.

(Sadly, you can't rely on the typical end-of-page debugging output, as enabled in the CF Admin, because that output is only shown if the request completes. We're referring here to pages that end in error.)

Can't I force CF to timeout some specific tags?

Again, in my experience (as I focus on CF server troubleshooting as a consultant), the root cause of problems in most "long-running" requests in CF is that some one tag or function is running long.

So could we perhaps force CF to timeout that specific tag? Well, yes and no.

You can, in fact, (and should) consider whether the tag in question might have its own TIMEOUT attribute or feature (and whether it will really help, as I'll explain.) Let's look at each of them.

Setting Timeout on CFHTTP, CFINVOKE, and others

There is indeed a TIMEOUT attribute on CFHTTP. Unfortunately, it won't ALWAYS keep the operation from exceeding that timeout. I've not quite put my finger on it (just haven't experimented completely), but if I had to guess, I'd say that it could be that if the operation is in the midst of returning data (from the server to CF), then it could perhaps time it out, whereas if it's waiting for the output then it may not be able to. Anyone know for sure?

There is also a TIMEOUT on CFINVOKE (for use when calling web services). Curiously, though, there is no TIMEOUT for use with CFOBJECT when calling web services (try it, it won't work, and none is documented). More curious still is that there IS a timeout available for use with createObject() (when calling a web service), though only by way of an argstruct argument that's new in CF8, which I have blogged about. Note as well that, according to the docs, that only times out the process of obtaining the WSDL, not the execution of any method in the web service.

There are also timeouts on various other operations that talk to something outside of CF (cfmail, cfftp on open/close operations, cfldap, cfpop, cffeed), though again it seems reasonable to expect that these may not always honor the timeout at the exact time given, as discussed above.

Setting Timeout on CFQuery

What about the elephant in the room, CFQUERY? Well, yes, it does have a TIMEOUT attribute, but many have found that it often does not timeout the query. Like the CFHTTP, I wonder if it may be a question of whether it's waiting for output (which likely can't be interrupted) or starting to receive it (which likely can be).

I will note that there's some promise in this regard, though for now not from CFML itself, but rather from the updated JDBC drivers in CF9 and the addition of a new timeout option in the CF Admin Datasource Advanced Settings page. You'll see that there is a new "query timeout" option that was not in CF before 9. I have blogged about it in more detail. It's not perfect: people are reporting different experiences with it (see the comments in the blog entry), and note (more important) that for now there seems no corresponding connection between this and the CFQUERY TIMEOUT attribute. (As I note there, I have raised a bug about this.) Still, it may be better than nothing and could help many, if you're on CF9.

So is there really nothing I can do for the hung requests?

OK, so we've explained why the requests don't timeout, often because they're talking to some remote process that is not responding. But what CAN you do when you're in this boat? Well, other than trying to add timeouts to the code as discussed above, generally nothing, at least for the requests that are already running.

And certainly a restart of CF will kill them off, or at least stop CF trying to talk to the remote process. (Of course, it's possible that upon restart, new requests will come in and try to connect to the same non-responsive or slow-responding remote process, so it could come right back.)

Stop the request on the remote server

But while you can't do much from WITHIN CF for these hung requests, there's one other way you may be able to stop the madness: stop the request on the remote server.

Once you can determine exactly what tag it is that's hung up (with the stack tracing tools above), you could then target whatever it was waiting for: the database server, a remote page called via CFHTTP, an exchange server using CFLDAP, etc.

Since the tools that let you stack trace the running request also show you the time the request started, you could use that info to go to the administrator of whatever service you're calling and ask if THEY may be able to kill the request. As soon as what you're waiting for stops, the CF request will continue. (Of course, it may only continue for a few milliseconds before it will be timed out by CF, as I discussed above, which is why I'm no fan of the CF request timeout feature, and think it should be turned off. More on that in a moment.)

Beware: you may not always find the remote server still "hung up"

Back to this issue of finding and killing the remote process that CF may be waiting for (that's causing your hung request), I should note that there may be times when you would go to the remote administrator and say, "look, I have this long-running CF requests that's waiting for this process (query, ldap request, web page, etc.) that is waiting forever for something that is running long on your server". And they may look and see nothing on their ends that's running long. Doh!

Yep, it can happen, for various reasons, so just be sensitive to this. You may really then have no way at all to kill the hung request. But note, again, that you may be able to use this observation to do something more to prevent the problem in the future, perhaps on the remote server side.

For instance, I've heard some describe problems where CFQUERY processing has hung talking to an Oracle database and (if I've got it right) the problem is an inconsistency between the CF datasource connection timeout and Oracle's "session" timeout. If anyone has more details on that, please do share.

But my point is simply that CF may be "waiting" for a call that will never be answered and can't be terminated from the other end. Again, in such cases, you can only kill them by restarting CF, and then you need to investigate how/why the call to the remote server are getting hung up in the first place. That's where logging information for diagnostic purposes may really come in handy, as is discussed next.

Logging what CF is getting hung up, to show to the remote administrator

If this problem (of calls to remote servers that take too long or get hung up) is happening often, and/or you can't always be logged in to see when it's happening using the tools above, another idea is to log for yourself whenever you make such a call to a remote server (that you know tends to hang up), such as putting a CFLOG statement before and after the CFHTTP, CFLDAP, CFQUERY, etc.

At least then you'll be able to see when it does and doesn't take a long time. The log would also help you by showing when it logs a start but no stop.

You could also code it so that it only logs when it's slow, but being able to confirm that it's generally fast and only sometimes slow may be itself useful diagnostic info.

Note that CF 9.0.1 by default adds new logging that does automatically log the start and end of calls to cfhttp, cffeed, and more to corresponding new logs (http.log, feed.log, etc.), which could also help.

Finally, as for logging the queries, you can get that from FusionReactor and SeeFusion automatically, as their "jdbc wrapper" features allow you to log every query (or optionally only those slower than a certain time). There is also a new "log activity" feature in the "advanced settings" of a CF datasource definition that could also log DB activity, though it is quite verbose and a tad unwieldy (not one line/row per query like the other two tools).

Bottom line: I'm no fan of request timeout features

So all that said, I'll repeat and clarify that I'm no fan of request timeout features, not that in the CF Admin, nor that offered in CF monitoring tools that offer to "kill requests" automatically, like the CF Server Monitor Alerts, FusionReactor's crash protection, and SeeFusion's active monitoring rules. I don't think they should be used, personally.

Let me be clear: I do love those tools and use them and help people use them daily. And I do love and highly recommend the features in those tools for sending you *alerts* when requests exceed a given time. What I don't like is them trying to kill them automatically, for all the reasons I outlined above. So I tell clients to turn off the "timeout requests" feature (though it does still make sense to use TIMEOUT attributes on certain tags, or may make sense to implement the CFSETTING RequestTimeOut on some page where you know that the reason it runs long is not one of these things that can't be killed anyway.)

Instead, I recommend (and help my clients daily) to use the alert info (from the CF Server Monitor, or FR or SF) to be notified if/when requests ARE taking too long--and NOT to kill them. Note that these tools all send the notice *as soon as* the request takes too long (whatever time you set), whereas CF's "log slow requests" feature only logs when requests end--and that's only IF they do end without failing.

So yes, get notified that requests are taking too long. Use the info in the alerts, which includes the stack trace info I discuss above. Do find and resolve the problem. Don't rely on (or in my opinion even use) auto-kill features, when in fact they nearly never are able to kill really problematic requests anyway.

Yes, yes, I do realize that there are some requests that CAN be interrupted by these timeout/kill features, but I'll assert that such requests are far less commonly the cause of any serious problems. Your mileage may vary, of course. But I make my statement based on several hundred instances of helping folks solve typical CF server problems.

So why is the "timeout requests" setting there?

One last thought worth considering: someone might reasonably ask, "Charlie, why are you such a hater of the setting? If Adobe has it there, it must be for a good reason."

Here's what I'd say to that: sure, when CF originally ran on C++ (prior to CF 6), perhaps this setting could be reasonably relied upon to ensure that requests would not run any longer than the set time. (I don't recall, but perhaps even then there may have been at least SOME tags that it couldn't interrupt.) But clearly since CF 6, in the Java model, this is no longer the case.

And yet if you read the Admin page, or its help, or the docs, or the comments from nearly anyone who considers the setting, the presumption is that this WILL stop requests from running longer than the x number of seconds indicated.

Why am I so impassioned/manic about this?

I hope I've made clear in this entry why I think that's not only wrong to conclude (in nearly all cases), but worse it sets up a tragic misconception of how CF works. If you think this should and will stop long requests (or that the alert features of the monitors will kill them), then you're going to be in for a shock when requests do hang up for an extended period of time. What are the implications?

  • You may totally under-estimate how many simultaneous request threads you should enable.
  • You may never pay attention to tools like CFSTAT or jrun metrics (to observe at least "how many requests are running" at any given time), which will help you see if/when requests are hung.
  • You may never bother to learn how to use the CF Server Monitor (or FR or SF), all of which can go still further and show not just how many requests are running (possibly hung) but a) how long, b) what the URL is, c) what the IP address is, and so much more, which can help you find and resolve problems.
  • You may never bother to learn how to do the stack tracing that I discuss above, which is often vital to understanding where and why any given request is hung (or was at the time an alert was thrown)
  • You may never bother to analyze logs that show the activity patterns (how many requests are running at periodic intervals, such as the FusionReactor "resource log" reports.) It's really THAT information that is vital to your understanding what to set for your simultaneous requests setting.
  • and so on

All of this info (and understanding) is VITAL to a very important and common class of CF server troubleshooting: why is CF up but not responding? It may be that requests are hung.

But if you assume, "well, they can't be running any more than x seconds", then you'll start to think "so it must be something else", and you figure you may as well just restart CF. Or you start reading about how someone suggests you change your JVM settings (which may have NOTHING TO DO with this problem, and not only not solve it but could cause new ones), and so on.

Again, I see this all the time. I hope by this entry to have helped avoid some of the very common misunderstandings on this subject that I frequently see either on lists, or in emails to me, or in my consulting engagements. If I seem passionate about it, it's because I am. Same with the memory issues I discuss in the related and similarly titled entry, CF911: Lies, damned lies, and when memory problems not be at all what they seem, Part 1.

Need More Help?

I mentioned above that I provide CF Server Troubleshooting consulting. If you need some help understanding how to apply the information above to your specific problem (or need help with any CF server, or CFBuilder, problem), I'm happy to help.

I don't need to come on-site, nor do you need to give me remote access. Instead, we can work easily and securely right over the web using Adobe Connect.

And I don't have any minimum time-block requirement--and I even offer a satisfaction guarantee. To learn more, including rate plans, see my consulting page. (I hope some will forgive this brief commercial here. I don't generally mention it, but since some say that they didn't know I offer such services, it seemed an appropriate point to mention it.)

Conclusion

So phew, another really long blog entry. But I hope it may help some people (and help some who help others).

As always, I welcome your feedback, corrections, additions, etc. Really, I ask for your feedback. If it helped, please say so. My blog doesn't get the traffic of many others. I often see that hundreds of people have read things, but few ever comment. I can't know if it's that I've answered every question (I can hope so), or that you weren't impressed. Like the guy said in Dirty Harry, "I gots to know". :-) Sometimes, all it takes is a few people to "prime the pump" and start commenting to lead others to do so. Why not grab the handle? :-) And if you think this would be helpful info for others, please do share it (tweet about it, mention it on mailing lists/forums when you see the problem raised, etc.)

I'm planning to better organize and package CF server troubleshooting resources (mine and others). We have a lot of great info out there for those solving CF problems. It can just be a challenge to sort through it all. I hope to help solve that. Look for more news to come on that front in time.

New for CF9 (and 9.0.1): a query timeout that really works, with a caveat

This is a very interesting change in CF9 (and 9.0.1), which has slipped under the radar for the most part as far as I can tell.

Did you know there is now a setting in the DSN page of the CF Admin (for most DB drivers) that allows you to set a maximum timeout for queries against that DSN? It's a new feature enabled for the DataDirect drivers udpated in CF 9. The caveat? It is ONLY settable there, not in CFQUERY itself, which is a shame (the existing TIMEOUT attribute is not the same and generally does not work). Still, the value of this even at the DSN level is too important to ignore for some challenges. More on that (and some other thoughts) in a moment.

As for the setting, it's in the "Advanced Settings" section for a DSN in the CF 9 Admin, and it's called "Query Timeout". This should not to be confused with the older settings, "Timeout" (which is about inactive connections) or "Login Timeout" (which is about logging in to the connection). The screenshot at right shows all 3. (This blog entry continues, with more information below it.)

I've run a test, and it really does do the job, which is huge. Why? Because it's been a long-time issue that if a CFQUERY got hung up waiting for a response, that request thread (doing the CFQUERY) is then hung until the query finishes, which can sometimes be many minutes, or even hours or days, due to some odd situations. More important, a thread waiting for a query with no timeout can't be terminated (by the JVM, or CF, or the monitoring tools) because the thread was in a native thread state.

With this new option specified, if the request exceeds the timeout, the request does now fail, with a JDBC error, "Execution timeout expired." The same test does NOT timeout with the older cfquery TIMEOUT attribute.

Here are some other notes on the new feature:

  • It works with most of the database drivers. I have confirmed that the setting appears in the DSN settings page for SQL Server, Oracle, MySQL (datadirect), DB2, Informix, Sybase) in CF 9 Enterprise, and in both SQL Server and MySQL (Datadirect) on CF 9 Standard.
  • The timeout's specified in seconds.
  • You can learn more in the CF Admin guide, in this specific page.
  • Oddly, the Admin manual page above only references this new setting in that CF Admin manual is in the the MySQL settings, but again it does appear in all the drivers above.
  • The manual page does even reference the other DBMSs by name in its naming the methods of the Admin API (for other DBMSs) which you can use as well, which can be used to set this default setting in the DSN programmatically.
  • That said, again there is sadly no new QueryTimeout (or QTimeout) attribute for CFQUERY, so for now we can only set this at the datasource level, not per query.
  • I've raised the concerns above on the CF Admin livedocs page (or whatever we are to call it now.)
  • If you look under the covers (in [cf]\lib\neo-datasource.xml), there is in fact a querytimeout connectionstring that this setting controls. If only there was a way we could pass connectionstring values to the CFQUERY, we'd be golden. Some may recall use used to have just such an attribute (ConnectString), but sadly it was deprecated in CF 6. I did try it, to no avail.
  • I will raise have raised a bug with Adobe to get a new attribute for CFQUERY related to this. When I do that, I'll report it here. It's bug 83592. Please add your vote of support for it.
  • There's another page on this Admin setting, in the CF Dev Guide, for those who may be interested in following any other possible places where this feature may be discussed (in the comments there).
  • If you want some code to use to test a request waiting a long time for the DB to return, most databases offer a statement that tells the DB to wait for some time. In SQL Server, that's WAITFOR DELAY 'hrs:mins:secs'. Just use that in a CFQUERY, assuming your DSN definition doesn't limit what SQL statements you can use, in the 'Allowed SQL' section of the Adcanced Settings page.

Why is this setting important?

I think this is a very important setting, and though it has been a hidden gem, it seems, it's one that people should consider. That's why I've flagged this in my CF911 category.

If you're suffering situations where requests are hanging due to long-running queries, and you have not been able to solve the real root cause for why they are hanging (which I always recommend first and foremost), then at least this option can help avoid a situation where queries can run without time limit. An Admin can decide that no queries should be allowed to run more than x seconds.

With that power comes responsibility, of course, and caution. You wouldn't want to preclude someone being able to run a query that really needed to take a long time. That's why it's really better if this was settable on a per-query basis. (And no, the CF page timeout settings are NOT the solution here, because again as I said above, they cannot timeout some kinds of long-running tags, like CFQUERY, CFHTTP, CFINVOKE of a web service, etc.)

What one could do, though (for now), is create different DSNs, where one could be used for most query processing, and another could be used for long-running requests. Yes, it's ok to have 2 DSNs point to the same DB. This same technique has been used when wanting to have most queries run against one DSN with a limited set of "allowed SQL" (per the DSN advanced settings) while another DSN has unfettered SQL access.

Hope this is helpful to some. Let me know what you think, whether this was helpful or if you feel I left something out. Especially please let me know if you may know of a way that we can indeed pass querystring values on a per-query basis.

New DataDirect JDBC Type 5 drivers (for SQL Server, MySQL, Oracle, and more)

I think most folks know that the underlying database drivers in CF are from DataDirect. Well, they've announced new "Type 5" drivers. While you would have to buy and install them separately from those built-into CF (for now, as Adobe has not yet certified CF for use with them), I think some people may want to give them serious consideration even before then.

Several Performance Advantages, and Failover As Well

Among their advantages (over the older Type 4 drivers that are built into CF) are faster performance and lower memory use (more here), as well as greater tunability in general (discussed among other features between 4 and 5), and still more.

For some, more compelling still is the new driver-based failover, to have the driver detect a connection failure and switch communications to another DB instance (much like CF 8 Enterprise added the ability to do the same with mail server configuration in CF). This could be a huge improvement for many.

Getting More Info, Trials, Purchase

You can learn more about (and purchase or get a trial of) the drivers at http://www.datadirect.com/products/jdbc/index.ssp. The page also includes links to more on the benefits of type 5, comparison to Type 4, and so on. (Sadly, I just missed their cleverly titled "These go to 11" webinar earlier today. Perhaps they'll offer another soon.)

When Might Adobe Bundle Them in CF?

You may have caught that I referred to "purchase" of the drivers. Sadly, no, the drivers are not free. Adobe has licensed and included them in past releases of CF. So when Adobe may modify CF to bundle the new drivers?

Some have heard that Adobe's working on an updater for CF 9 (mentioned publicly at cf.Objective() among other places). Can we expect the drivers to be updated in that release? It's hard to say. In the 7.0.2 timeframe, the then-new drivers (3.5) came out just after the release and so were not included in the updater but instead were mentioned as a manual option in the CHF 1 technote. (Many missed that and to this day I suspect many shops still running CF 7 still have updated those drivers.) The point is, they did not include it with the updater (as I blogged about at the time in this blog entry.)

I can understand why they may not bundle new DB drivers with a CF updater: with updaters, they have to rebuild and retest all the installers on all the platforms, and updaters are generally not about adding "new features". It's a slippery slope as to whether driver updates are "new". It may also be a matter of timing, with respect to when Adobe learned about the drivers (likely sooner than most of us) and whether they found them compelling enough to consider, and at what cost in terms of backward compatibility. It's worth noting that the Type 5 drivers do assert to bring all their benefits "without code changes", so who knows?

The bigger question for some may really be simply, "I'm ok if they make me install it manually. It would just be nice to not have to license them ourselves." That's a fair point, and we just won't know until Adobe does address this issue (perhaps as they did in 7.0.2 CHF 1.) Until then, we have to wait and see.

But again, I do think the update compelling enough that shops with interest in performance should at least consider testing out the free trial to see how things go, or to prepare for the possibility of leveraging that new failover feature. (It also remains to be seen if Adobe may somehow try to force that failover feature to be an Enterprise-only feature, just as the mail server failover is.)

About the Company and Product Name Variations

Some may wonder if they visit the site for the drivers and notice the company being named Progress, when it used to be Merant. You can read about the fairly long chain of changes in the company history page.

One other thing worth noting is that the product is sometimes referred to as "DataDirect JDBC Type 5 drivers" (as I show in the subject here) and yet also sometimes as "DataDirect Connect for JDBC". They're the same thing.

One Boo Boo on the DataDirect Site

As I was tooling around their site, I happened on the page about use of the tool with JBoss servers. They said this:

The top web and application server vendors embed or recommend DataDirect JDBC drivers as part of their J2EE certification strategies (JBoss, IBM WebSphere, Sun Microsystems, Oracle BEA, and Macromedia.)

Can you spot the mistake? There is no more Macromedia, of course. I tried to find a page on the site to send a comment, but their Contact page didn't seem to offer a suitable option for such a comment. Oh well, so I am adding it here in case someone who cares can get something done about it.

Finally, I'll point out another useful page on their site, not new to the Type 5 drivers, but a helpful list of performance tips.

Having problems with SQL Server/Oracle/DB2/Sybase? Check out Confio Ignite

Hey folks, if you're having problems with your CF apps and you determine that (or wonder if) the cause may be due to problems in the database, check out Confio Ignite, a commercial tool that may be well worth the price for you.

Sure, there are many DB monitoring tools out there, but Ignite focuses on tracking, analyzing, reporting, and explaining wait events within the database--and you'd be amazed how often waits caused by your code, that of others, or from other operations in the DB are the explanation for poor performance. It can help target exactly what SQL statement or other operation is a cause of significant waits.

The tool presents the data aggregated over time, so you can view it per hour, day, week, etc. Great for both drilling down to find hot spots, and for viewing how coding/config improvements (resulting from your responding to the analyses) have led to performance improvements over time.

The tool runs with low overhead: it reads data that the DB provides, storing it in a database and providing a web-based interface to view that data. The process to read the data and create the repository (and present the web-based interface) can (and should) be done on a server separate from the server being monitored.

Here's a nice 2-minute demo. There's also a free trial, of course, and it's pretty quick and easy to install and benefit from.

As I noted in the title, it works with SQL Server, Oracle, DB2, or Sybase (sorry, not MySQL. Don't know why). And while it's a commercial product, it's not a ridiculously high price (as for some tools). I just learned of it in the past few weeks, and one customer of mine who tried it has been just thrilled with the results. I hope to write more about it later, but wanted to at least get this info out for folks to consider.

Resources for getting into the Multiserver (multiple instance) implementation of CF

You may have heard of the new Multiserver deployment option that was introduced in CFMX 6.1, also known as "multiple instances". It can bring tremendous performance and reliability improvements, allowing you to segregate apps on a single server either by function, or reliability, and so on. It can also help you manage memory more effectively.

Since many people may only be considering the feature now (either only now moving to 6.1 or 7, or 8), I want to share some resources if you're new to it. (The question came up on a list, and I offered the info there, so thought I'd pass it along here.)

First, there are a couple of articles from that time 6.1 frame:

"Introducing Multiple Server Instances in ColdFusion MX 6.1", by Tim Buntel

"Using Multiple Instances with ColdFusion MX Enterprise 6.1" video (sadly, seems no longer available)

Now, CF7 did introduce the new Instance Manager within the Admin, and that (and instances in general) is covered in the CF manual:

Configuring and Administering ColdFusion MX (Chapter 7, "Using Multiple Server Instances")

Finally, there is also a new Adobe article as of CF7:

Multiple Server Instances using ColdFusion MX 7 Enterprise Edition

(Update: There's also now a CF8 version of that: Multiple server instances using Adobe® ColdFusion® 8 Enterprise Edition. The technical content seems identical, but it appears to have had considerable editorial updating.)

There are certainly other articles folks have done in the CFDJ or at CommunityMX.com, but these should get you started.

Even though it's old news to some, it does seem that like many things, use of instances is something that may have been missed by folks. I've been contemplating a new user group presentation on the topic. Nothing new for CF8, but it seems people are considering things now that they may have ignored when 6, 6.1, or 7 came out (which is why I did my daylong class at CFUnited on what was new in 6 and 7 that folks may have missed).

One last point, if those don't make it: if you're running on Windows, don't try to create an instance with a JVM heap greater than about 1.3 GB. Though Windows should allow 2GB per app, this is just a number many found that beyond which CF won't start. Hope that all helps.

New (free) Performance Dashboard for SQL Server 2005 SP2

Those using SQL Server 2005 may want to take note of a new "Dashboard Report" option for SP2, to help monitor and resolve performance problems, including capturing diagnostic info when a problem is detected.

Common performance problems that the dashboard reports may help to resolve include:

  • CPU bottlenecks (and what queries are consuming the most CPU)
  • IO bottlenecks (and what queries are performing the most IO).
  • Index recommendations generated by the query optimizer (missing indexes)
  • Blocking
  • Latch contention

The report is an extension of the Custom Reports feature introduced in the SQL Server 2005 SP2 release of SQL Server Management Studio. Note that Reporting Services does not need to be installed.

The reports retrieve info from dynamic management views. They don't poll performance counters or require tracing be enabled. They also do not store a history of performance over time. So it's a lightweight (yet powerful) monitoring option.

You can get the extension itself at:

Performance Dashboard Reports

There's also a complete article about how to install it from Black Belt Administration: Performance Dashboard for Microsoft SQL Server, Part I , at Database Journal (and from which I obtained the image above).

Charlie Arehart offers new per-minute phone-based support service

Ever felt stumped trying to solve a CF problem? You've tried everything? Searched the web? Asked on forums and lists, and you're still stuck? Or maybe you're just pressed for time.

Maybe you've wished you could hire someone with more experience but can't justify a high hourly rate. Of course, with so many web tools available to share a desktop, travel need no longer be a significant issue. Sometimes, it could help to simply have someone "look over your shoulder" via the web to resolve a problem.

Recognizing all those challenges, I've created a new service that I'm tentatively calling "AskCharlie", to be able to offer just such assistance.

Via buttons on my site or an 888 number you call, you can arrange to speak with me by phone (and optionally join me in a shared desktop session) to solve some knotty problem.

Best of all, it's very low cost and at a per minute rate (first-time callers can use a 10 minutes free option, and everyone gets a money-back guarantee). That and more are explained on my site:

http://www.carehart.org/askcharlie/

There you can learn more about why I did it, how it works, how the remote desktop sharing assistance works (no problem with firewalls, for instance), and more.

I'd welcome your thoughts on what you think of the idea.

Need to know how long your ColdFusionMX instance has been up?

If you've ever needed to know how long your CFMX instance has been running, did you know there's an incredibly simple solution? It's not really obvious, and in fact it may not work on trial editions of ColdFusion (for reasons explained below), but it's worked on the Enterprise and Developer editions that I and others have tested it.

Some may have noticed that there is a server.coldfusion.expiration variable (one of many read-only variables in the server scope, within the coldfusion structure). According to the docs, it's supposed to represent the date on which a trial edition of CFMX will expire.

But some have found (and I've confirmed) that on non-trial editions it instead reports the time that the server started. Try it youself:

<cfoutput>#server.coldfusion.expiration#</cfoutput>

It'll be a date, but is it in the future? or the past? Assuming it's the past, and you're not running on a trial edition, then it's likely the time your server had started. If you could restart your server and test again, you'll know for sure. :-)

Better formatting of the result

Now, if you want to easily determine how long the server has been up, just use the datediff function to report how many minutes there are between the time reported and the current time:

<cfoutput>#datediff("n",server.coldfusion.expiration,now())#</cfoutput>
Now, that number will be in the thousands after only one day. If you just want to better format the number of minutes, you could use the numberformat function which (without any formatstring) simply adds commas where they would be appropriate:

<cfset uptime = datediff("n",server.coldfusion.expiration,now())>
<cfoutput>#numberformat(uptime)#</cfoutput>

But even better, you could use one of the nifty functions available at the cflib.org that could help present the result minutes in terms of days, hours, and minutes (duration()). Assuming you included the latter in your page, you could then output the value as:

<!--- assuming you have included or pasted the duration UDF --->

<cfset uptime = duration(server.coldfusion.expiration,now())><br>
<cfoutput>#uptime.days# Day(s) #uptime.hours# Hour(s) #uptime.minutes# Minute(s)
<br>Server started on #dateformat(server.coldfusion.expiration)# at #timeformat(server.coldfusion.expiration)#
</cfoutput>

I may put together a UDF to submit to the cflib to pull all this together, and which also checks if indeed the server being checked is a trial edition, etc. But I'll look forward to hearing people's feedback first.

Other options

Of course, it's worth pointing out that there are of course many tools that will monitor a server's uptime by watching it externally. It may also be possible to ask the operating system to report how long a given process or service has been running. I'll leave that for others to investigate.

Finally, I'll point out that there is indeed a tag in the Adobe Developers Exchange, CF_UpTimeMonger, that purports to help detect server uptime and says it works even for CF 4 and CF5. I have not explored it.

BlogCFC was created by Raymond Camden. This blog is running version 5.005.

Managed Hosting Services provided by
Managed Dedicated Hosting