CF911: Lies, damned lies, and when memory problems may not be at all what they seem, Part 1
I hear people raise concerns about memory problems quite often, whether in my CF Server Troubleshooting practice or just in my participation on many mailing lists. Indeed, addressing this issue more than a few times in the past couple of weeks has motivated me to write this, the first in a series of blog entries.
The series parts are expected to be:
- Step 1: Determine if indeed you are getting "outofmemory" errors (this entry)
- Step 2: Realize that OutOfMemory does not necessarily mean "out of heap" (entry to come)
- Step 3: Diagnose why you really are running out of heap (if you are) (entry to come)
- Step 4: Realize that having high memory usage is not necessarily a problem (entry to come)
Common refrains about memory issues
The common complaints about memory issues (and my quick responses, to give you a sense of where I'll be going in this series) are:
- "CF has a memory leak" (to which I'd retort, no, generally, it does not. There's nearly always some other explanation)
- "CF's use of memory is high" (which may not be a problem. If you're looking at memory from the OS perspective, it may not matter as much as heap use within CF)
- "CF's use of heap memory is high" (to which you'd think, "ah, well that's got to be a problem, right?", but no, not necessarily, as I'll explain in part 4)
- "CF is crashing all the time" (well, is it really crashing, or just hanging up and not responding? That's not a good thing, but it's very different from it crashing on its own)
- "CF is running at 100% CPU before it crashes" (this could be related to memory problems, and be more a consequence than a cause, or it could be entirely unrelated to memory issues)
So what if we really are suffering a problem?
I'm not saying there's never a real problem of really "running out of memory". It's just that often things are not at all what they seem (or what most presume them to be, from my experience helping people), and that's going to be the bulk of what I'll talk about in this series.
But what if your server is really crashing (or simply not responding), and you think/swear/know that it's a memory problem....
What should you do? Increase the heap size? Increase the permspace? Change the GC algorithm?
Sacrifice a chicken?
I'd say, none of them (though if you're in a rural setting, then perhaps cooking and eating the chicken might help settle your blood sugar so you can stay calm). Really, I know that goes against conventional wisdom, which seems always to suggest diving into the JVM settings. I'd say "hold on there, pardner."
Step 1: Determine if indeed you are getting "outofmemory" errors
This is a step that surprisingly few people take when faced with their server crashing or not responding. They go with whatever gives them a sense of there being a memory problem, perhaps adding their own experience or what they've read, and they start chasing solutions.
I can't tell you how often I hear people lament that they've googled and found all manner of conflicting and confusing recommendations. And it doesn't help at all that they may be running on CF 8 or 9 (with Java 1.6) while reading about a "solution" written in the time of CF 6 or 7, when it ran on Java 1.4. Of course, the writer often won't have thought ahead to clarify that.
Instead, I'm saying, "stop, drop, and roll".
"Stop" the tail-chasing, "drop" into the pertinent logs directory in CF, and "roll" through them looking for an occurrence of "outofmemory".
Look first in the Runtime/JRun logs
Let me be more explicit: the logs you want to look at for these outofmemory errors are NOT the ones you see in the CF Admin Log Files page. Those are in the [cf]\logs directory (or are buried deep within an instance on Multiserver).
Instead, look in the [cf]\runtime\logs directory (or [jrun]\logs on Multiserver), especially at the -out logs. If you're on Multiserver, there will be a prefix for each instance name in the log file names, and there are also -event logs there that you might be able to ignore. (If you're on *nix, you do want to also look at the cfserver.log in the main CF logs directory.)
Some refer to these as the runtime logs, or the jrun logs, or perhaps the jvm logs. Whatever you may call them, their location is above and the explanation of their value follows.
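For those who would rather script this check than open each file by hand, here is a minimal Python sketch of the idea. It is only an illustration: the function name `find_outofmemory_lines` is made up for this example, and the `*.log` pattern is an assumption about how the files in your logs directory are named, so adjust both to suit your setup.

```python
from pathlib import Path


def find_outofmemory_lines(log_dir, needle="outofmemory"):
    """Return {file name: [(line number, line text), ...]} for matching lines.

    The match is case-insensitive, so "OutOfMemoryError" is found as well.
    """
    hits = {}
    # Assumption: the logs of interest end in .log (such as the -out logs)
    for path in sorted(Path(log_dir).glob("*.log")):
        # errors="replace" keeps the scan going if a log holds odd bytes
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line.lower():
                    hits.setdefault(path.name, []).append((number, line.rstrip()))
    return hits


if __name__ == "__main__":
    import sys

    # Pass the directory to scan, for example your runtime logs directory
    log_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, lines in find_outofmemory_lines(log_dir).items():
        print(name)
        for number, text in lines:
            print(f"  line {number}: {text}")
```

Run it with the logs directory as its one argument. It lists each file that mentions the string, followed by every matching line, so you can see all occurrences rather than only the first.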
Bonus topic: you can increase the max log size
(I will note that the way these -out.log files work, by default, is that they fill in 200 KB increments. Yep, not 200 MB, but 200 KB! You may blow through dozens of them in a few minutes if things are going nuts. That size is configurable, but not through means you'd normally expect. See the blog entry I recently published, CF911: How to control the size of CF's -out.logs.)
Look next in the hotspot/pid/jvm abort logs
Separately, there are some other potentially important logs that may reveal info concerning memory problems: what some call the "pid", "hotspot", or jvm abort logs. The filename is in a form like hs_err_pidnnnn.log, with some number in place of the n's.
These logs are found in a very unexpected place (for logs): in the directory where CF stores the jvm.config. So on Standard/Server deployments, that's [cf]\runtime\bin. For Multiserver, that's [jrun]\bin. Look in there for any .log files. There is one log for each time that the jvm crashes due to certain kinds of problems. It could be a crash in the hotspot compiler, in hotspot compiled code, or in native code.
What matters most, for this post, is that again you look in them for any reference to the phrase "outofmemory". (Of course, you'll want to look at them in the event of a crash for any other message or info, but explaining the parts and value of these "pid" log files is beyond the scope of this entry.)
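As a rough illustration, the same check can be scripted for the abort logs. The Python sketch below assumes you pass it the directory that holds your jvm.config. The function name `scan_abort_logs` is invented for this example, and it relies only on the hs_err_pidnnnn.log naming form described above.

```python
import re
from pathlib import Path

# Matches names of the form hs_err_pidnnnn.log, where nnnn is some number
HS_ERR_NAME = re.compile(r"^hs_err_pid(\d+)\.log$")


def scan_abort_logs(bin_dir):
    """Return [(file name, number in the name, mentions outofmemory), ...]."""
    results = []
    for path in sorted(Path(bin_dir).iterdir()):
        match = HS_ERR_NAME.match(path.name)
        if not match or not path.is_file():
            continue  # skip jvm.config and anything else in the directory
        text = path.read_text(encoding="utf-8", errors="replace")
        # Case-insensitive check, so "OutOfMemoryError" counts too
        mentions = "outofmemory" in text.lower()
        results.append((path.name, int(match.group(1)), mentions))
    return results


if __name__ == "__main__":
    import sys

    for name, number, mentions in scan_abort_logs(sys.argv[1] if len(sys.argv) > 1 else "."):
        flag = "mentions outofmemory" if mentions else "no outofmemory reference"
        print(f"{name}: {flag}")
```

A file flagged as having no outofmemory reference is still worth reading after a crash. The script only answers the narrow question this entry is about.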
So, when you have a crash and you suspect it's a memory issue (or if you don't know the cause and want to learn more about what it may have been), you want to look in these two log directories mentioned above. Many never do, and this is part of why they end up chasing their tails, going instead on gut feelings or trying out various alternatives. I say: find the diagnostic info, and act on it.
Searching through the log files, the easy way
But rather than "look" at all these logs in these directories, one at a time, I suggest instead that you search them. (I was tempted to say "stop, drop, and mole" above, since you're "ferreting" through the logs, but it just seemed a stretch.)
If you're on *nix, I don't need to give you any more info on how to search the files. Just grep it and rip it. :-)
If you're on Windows, though, you'll perhaps be tempted to use the good ol' built-in Windows Search tool to search the directory. Let me plead, for the sake of all things decent, please use something better. It's just not a completely reliable (nor fast) tool.
I have blogged about a wonderful free alternative, called File Locator Lite. Use that instead. (If you have another tool or editor you favor, that's fine. No need to bring that up here. I recognize those other options in the other blog entry.)
The beauty of FLL is that having installed it (which is fast, itself), if you right-click on the log directory (or any directory) and choose "File Locator Lite" from the menu, you can then just put the string outofmemory in the search box, and it will in a few moments show any files found in the lower left pane. Then, here's the real beauty over other tools, you don't need to double-click the files to open them. You just single-click each file, and in the right pane, it shows any lines that had the string that was searched. Brilliant, and again, a really fast way to find things.
Don't stop at the last outofmemory error before a crash
This feature of the File Locator Lite tool (to see all the lines in the file with that string) is especially useful in this case, because when searching for outofmemory errors, you also want to be able to quickly see the time for *all* the error messages you may find.
And you *do not* want to focus solely on the last error prior to the crash (or the slowdown that made you want to restart CF).
Once you find one (or more) preceding the time of the crash, you want to look for any occurring prior to it. It may be that the problem started several minutes before the crash (or your restarting CF). Further, it may be that the outofmemory error just prior to the crash is different from the one that started things out.
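To make that comparison concrete, here is a small Python sketch that takes the matching lines in the order they appear in a log and reports the first error, the last error, and a count of each kind. The function name `summarize_oom_lines` and the sample lines are hypothetical. Real log lines will be laid out differently, so treat the regular expression as a starting point rather than a finished parser.

```python
import re

# Captures whatever follows "OutOfMemoryError" on a line, e.g. "Java heap space"
OOM_DETAIL = re.compile(r"OutOfMemoryError:?\s*(.*)", re.IGNORECASE)


def summarize_oom_lines(lines):
    """Return (first detail, last detail, counts per detail), or None if no match."""
    details = []
    for line in lines:
        match = OOM_DETAIL.search(line)
        if match:
            details.append(match.group(1).strip() or "(no detail given)")
    if not details:
        return None
    counts = {}
    for detail in details:
        counts[detail] = counts.get(detail, 0) + 1
    return details[0], details[-1], counts


if __name__ == "__main__":
    # Made-up sample lines, for illustration only
    sample = [
        "10:01:12 java.lang.OutOfMemoryError: PermGen space",
        "10:01:40 some unrelated line",
        "10:04:55 java.lang.OutOfMemoryError: Java heap space",
        "10:05:02 java.lang.OutOfMemoryError: Java heap space",
    ]
    first, last, counts = summarize_oom_lines(sample)
    print("first error:", first)
    print("last error: ", last)
    print("counts:     ", counts)
```

If the first and last details differ, that is a hint that the error just before the crash is not the one that started things out.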
Step 1 down, 3 more to go
OK, that's step 1 in determining whether memory problems are really at all what they seem. As I mentioned at the outset, the planned parts in the series are:
- Step 1: Determine if indeed you are getting "outofmemory" errors (this entry)
- Step 2: Realize that OutOfMemory does not necessarily mean "out of heap" (entry to come)
- Step 3: Diagnose why you really are running out of heap (if you are) (entry to come)
- Step 4: Realize that having high memory usage is not necessarily a problem (entry to come)
After I publish them, I'll update the lists here to link to them.
As always, I look forward to your feedback (pro, con, or indifferent).


