Note: This blog post is from 2014. Some content may be outdated--though not necessarily. Same with links and subsequent comments from myself or others. Corrections are welcome, in the comments. And I may revise the content as necessary.I often help people who are reporting that CF is "running hot on the CPU", maybe reaching 80 or even 100% of the CPU, whether in spikes or for extended periods. What might you propose people look at, when you've heard that? I've heard all kinds of things over the years, often focused on coding, or perhaps jvm tuning.
But as is often the case in a lot of the CF server troubleshooting consulting I do, I find the causes to be far less often what most people seem to suspect. So what would I look for when someone reported high CPU in ColdFusion (or Railo)? Read on.
(BTW, from here out I will just mention CF for the sake of convenience, but what I offer applies just as readily to Railo as CF, except for one variation below.)
Make sure it's really CF using the high CPU
Let me say first, of course, that it's vital for folks reporting this problem to be really clear that the problem is indeed an issue of high CPU within CF itself. I've had more than a few occasions where people said the CPU was high, and it turned out it was NOT CF after all. It was something else on the box, but they assumed that "all the box did was CF", so they didn't bother to confirm if it really was CF.
I've seen virus scanners, security scanners, backup tools, and all kinds of other things steal CPU. (Of course, a database running on the same box as CF could be a killer of CPU, but nobody does that these days, right? Trust me, people do, all the time!)
That said, there is one possible cause of high CPU that COULD be something related to CF but running high CPU in some OTHER process, but because of your having been hacked. More on that later.
So what could cause high CPU? First and foremost: running out of memory
This may not seem obvious: why would high memory use contribute to CPU problems? But CPU use can indeed be very high when the heap within CF reaches its upper limit (your maxheap size in the CF Admin) and the JVM (underneath CF) is thrashing about, doing excessive garbage collection and perhaps heap dumps, as it suffers from outofmemory ("oom") errors.
Do you have to put a Java memory monitor on CF to see this happening? Not really. And in fact, by the time any monitor finds that there is an OOM condition, it will often be too late to do anything about it except restart CF. So the trick instead is to look in CF's logs, either while the problem is happening or once you recover/restart.
To find this sort of evidence, look in the ColdFusion-out.log and/or ColdFusion-error.log (in the [cf11/10]\cfusion\logs folder, or the [cf11/10]\[instance]\logs folder if using an instance in CF 10/11 Enterprise, or the [cf9]\runtime\logs folder, or [jrun4]\logs, or if on*nix, see the cfserver.log in the CF logs folder). You want to find the log for the instance in question, and you want to focus especially on any errors in the log BEFORE CF crashes or you restart it.
Here's a trick: do a search for the phrase "ColdFusion Started". That's among the last lines CF shows as it's coming up. (If you need to search across many of the logs, there are OS tools to search files, but on Windows I recommend the nifty free tool, File Locator Lite, which I've written about before.)
Then having found that spot in the log where CF is starting up (after a crash), look at the lines BEFORE CF started to come up--it may be a page or pages above that line, to see if you find any "outofmemory" error lines in either of those two logs, in the seconds or minutes (or longer) before CF crashed or was restarted. (You can also look for *.pid logs in the same directory as the jvm.config file. These are hotspot crash logs and often are related to memory problems.)
So why might memory be high in the first place? Well that's really a subject for an entirely different blog entry.
But let me say first and foremost that it's not unusual for sites (especially with high bot traffic) to have exorbitantly high session counts, which can contribute to high memory use, especially the longer your session timeout is, and the more you put into your sessions. I blogged more on how to find a current count of sessions within CF (without any code changes) here: Tracking number of CF sessions per application easily, and why you should care.
You could also have high memory/CF heap use due to your (your application's) use of CF query caching, CF's enhanced eh-caching features as added in CF 9, to name just a couple of things. And note that in CF10, each of these is not stored per application, whereas previously they were cached instance-wide.
What else could cause high CPU? Excessive Client Variable processing (don't skip this!)
Moving on to the second most common reason I see for high CPU in CF, I'd suspect excessive client variable processing, especially during the frequently recurring "purge" process, which by default happens in ColdFusion every 67 minutes (after the previous one), as CF purges long-unused entries from any of the repositories listed in the CF Admin Client Variables page (other than cookies, so meaning the registry and any listed datasources).
Now, before you skip this thinking, "we don't use client variables", think again. Or better, take my word for it and do the following check to find out if it's impacting you. I have helped MANY folks who SWORE they did not use client variables, only to find out not only that they did use them (unknowingly) but that it was KILLING their server.
Whether this purge processing is a problem at all can be viewed also in that ColdFusion-out.log. Look for a line saying "Run Client Storage Purge", and note the time. It should happen about every 67 minutes (or whatever the CF admin client variable page's purge interval may have been changed to, from its default of 1 hour and 7 minutes). If each purge log line starts exactly 1:07:00 after the previous one, then client variable purging is not a problem for you. But if there is a deviation of any more than 67 minutes and 0 seconds from one purge to the next, then the higher the difference the greater the impact of purging.
Of course, the purging could be in the database, which could cause a negative impact but not high CPU.
But if you are storing client variables in the registry, that's where you WOULD see high CPU (and other performance impact) due to the purging of that, especially so often, and this is indeed a frequent cause of high CPU. (And besides the CF Admin setting for the default client storage, note that your own code can set its own preference for client variable storage with the clientstorage attribute or cfapplication, or the this.clientstorage property in application.cfc.)
Worse, if you are on *nix, and you have CF set to store client variables in the "registry", CF will NOT store client variables in the registry (because of course there is none on *nix), but instead CF will store (and update0 them (all of them) in a single FILE (horrors) called cf.registry, stored in a "registry" folder in your coldfusion install directory. (And in fact, in Railo, even if you set it to use "registry" in the Admin or code, Railo will also itself still store client variables in the file system instead.) All this could be just as bad as storing in the registry, if you are using client variables a lot.
And that's the point I was making above: you may never in your code "use" client variables (by setting a variable like client.name="bob"), but that DOES NOT MATTER. If you enable clientmanagement (in cfapplication or as a property of application.cfc), then you WILL cause CF/Railo to be using (reading and writing) client variables on EVERY page request (unless you use the option to "disable global varaible updates" in your the Admin setting for each repository).
But you can't just go turning that off, nor can you just raise the purge interval, or change the default repository from one value to another, as any of those could impact possible legitimate use of client variables, if any. Sadly, it's a rather complicated subject to understand and resolve (which is why you don't hear about it often), but I have talked about it to a considerable degree here: Suffering CPU, DB, or memory problems in CF? Spiders could be killing you in ways you'd never dream.
And you can see how this topic of client variables is closely tied to the memory problem above, where I said that high memory due to large numbers of sessions can also be due to spiders and bots.
High CPU usage could indeed be due to some rogue code in your app
In my experience, those two above (CF running out of memory, or client variable purge processing) are far more common explanations of high CPU usage in CF than anything else, but we shouldn't rule out the possibility that it's simply due to some rogue CFML code. I really doubt it would ever be about any one tag or function, or the choice between one and another, but of course if you had an infinite loop (by mistake, of course), or perhaps even some tight loop over some large number of items that was then doing some cpu-intensive activity, then we could expect to see high CPU in CF.
So how would you find this happening? With any of the traditional CF monitors, like the CF Enterprise Server Monitor, or FusionReactor or SeeFusion. These could help you readily spot a long-running task as that's one of their primary tasks. So sure, if you saw some one (or a few) long-running requests and CPU was high, you may want to suspect them as possible causes and use one of these tools to investigate. And FusionReactor in particular will show you the CPU thread time used for any request, both while it's running and in its request history details and request log.
And as for finding WHAT line of code may be running at a point in time for a current request is on, that's where the "stack trace" feature (available in all 3 monitors but especially helpful in FR) comes in, whether obtained in the interface or generated automatically as an alert. I've blogged on stack tracing in the past. See CF911: Easier thread dumps and stack traces in CF: how and why, as well as a presentation I did at cfobjective in 2010 and recorded later on the CFMeetup: CF911: Stack Tracing CFML Requests to Solve Problems.
On the other hand, sometimes you may have a problem of high CPU within a particular running request, but caused by some code you might never have suspected...
High CPU usage could be due to CFML image processing
Related to the above, here is something you could do in code with no consideration at all that there could be a CPU impact: I've talked before about performance problems due to the default "interpolation" setting for CFML image processing. Check out Could CF image processing be killing your #ColdFusion server? Explanation and solutions, which also discusses how to address the issue.
High CPU usage could be due to some rogue code from a hacker!
Finally, I'll point out one more possible explanation for high CPU which has started to hit more and more CF servers in recent years/months: it could be that the reason for high CPU is NOT tied to traffic and NOT tied to your own mistaken code, but could be due rather to rogue hacker code that may have been placed on your server.
You may have heard about the bitcoin mining exploit that has hit some CF servers, for instance. When this sort of rogue app gets on your server, you may not see high CPU in CF itself but rather in a process implemented by the hack. The point is that you should definitely keep in mind the possibility that high CPU "on the box" may well be connected to CF even if not showing up as "in" CF.
You'll want to be sure to have applied all needed CF security hotfixes and protections, as discussed in the ColdFusion lockdown guide. For more on the lockdown guide, and other CF security resources, see #ColdFusion Lockdown/Security guides: there are several, and some you may have missed. And for more on applying hotfixes effectively and carefully, see Applying hotfixes to #ColdFusion 9 and earlier? A guide to getting it right.
Hope some of that helps you find and resolve your CF or Railo CPU problems. If not, and you think you could use a hand, again this is the kind of troubleshooting I do with people every day. Let me know if I can help, whether directly (remote, no minimum, satisfaction guaranteed) or perhaps with a quick question here in the blog entry. I welcome feedback as well, including if you think I may have missed another key cause of high CF CPU.
For more content like this:
- If you may prefer direct help, rather than digging around here/elsewhere or via comments, I can help via my consulting services
- See that for more on how I can help a) over the web, safely and securely, b) usually very quickly, c) teaching you as we go, and d) with satisfaction guaranteed