Suffering CPU/DB/memory problems in CF? Spiders, monitor pings, and more may be to blame
Note: This blog post is from 2006. Some content may be outdated, though not necessarily; the same goes for links and subsequent comments from myself or others. Corrections are welcome in the comments, and I may revise the content as necessary.

If you're trying to get to the bottom of high memory or CPU use, or database contention, on CFML servers, you may be missing a seemingly innocuous but deadly invader (or many of them, really; many more than you may ever dream). If you're focusing only on "what are my long-running requests?" or wondering "why does CF have a memory leak?", you may be looking at the wrong problem.
What if the problem isn't really a "leak" at all, nor "poorly performing code", nor "CF being unable to scale"? What if it's really due to thousands or millions of page requests, which (worse) are likely creating, unexpectedly, thousands or millions of sessions and client variables each day? They may even be something you're causing (but more likely not). It's a pernicious problem that many never even fathom, or dismiss too readily.
Curious to hear more? Read on.
http://www.bennadel....
... fixed an error that was made in ....
http://www.bennadel....
Anyway, what I basically do is turn off and on session management on a per-page-hit basis (as per Dinowitz's suggestion).
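The per-request technique described above might be sketched like this in an Application.cfm; the bot-matching pattern, application name, and timeouts here are illustrative assumptions, not from the comment:

```cfml
<!--- Application.cfm sketch: skip session/client-var creation for requests
      that look like bots (pattern and app name are assumptions) --->
<cfset request.isBot = (Len(Trim(CGI.HTTP_USER_AGENT)) EQ 0)
    OR (REFindNoCase("bot|spider|crawl|slurp", CGI.HTTP_USER_AGENT) GT 0)>

<cfif request.isBot>
    <!--- Suspected bots get no session or client variables at all --->
    <cfapplication name="myApp" sessionmanagement="no" clientmanagement="no">
<cfelse>
    <!--- Normal visitors get a standard 20-minute session --->
    <cfapplication name="myApp" sessionmanagement="yes"
        sessiontimeout="#CreateTimeSpan(0,0,20,0)#">
</cfif>
```

User-agent sniffing is imperfect (some bots lie, some browsers omit the header), so treat the pattern as a starting point to tune against your own logs.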
Also, Charlie, are you the carehart that made the first comment on this page: http://livedocs.macr...
Cause if you are, THANK YOU THANK YOU THANK YOU, you have saved me a lot of stress.
http://www.blogoffus...
Why have you switched over to a very short session duration? I assume that this is so that your whole site can assume that session management is being used without complicating the logic of the way things work???
-Ben
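One hedged sketch of the short-session-duration idea Ben asks about: a request that doesn't return the CFID/CFTOKEN cookies cannot be rejoining an existing session anyway, so it can be given a very short timeout while browsers that do return cookies get a normal one. The app name and timeout values are assumptions:

```cfml
<!--- Application.cfm sketch: cookieless requests (bots, RSS readers) get a
      throwaway 10-second session; cookie-returning browsers get 20 minutes --->
<cfif StructKeyExists(COOKIE, "CFID") AND StructKeyExists(COOKIE, "CFTOKEN")>
    <cfapplication name="myApp" sessionmanagement="yes"
        sessiontimeout="#CreateTimeSpan(0,0,20,0)#">
<cfelse>
    <cfapplication name="myApp" sessionmanagement="yes"
        sessiontimeout="#CreateTimeSpan(0,0,0,10)#">
</cfif>
```

Note that a real first-time browser visit also arrives without cookies, so any session data written on that very first request would expire quickly; this approach suits sites that only populate the session after a second request or a login.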
Mike makes a great point I did not: the client variable issue applies not only if you use registry- or database-based client vars, but cookie-based client vars as well. As he wrote, "It seems that when a client variable is set, a memory structure is also set for CF. Now each bot hit is assumed to be it's own session as it does not accept cookies. This mean each bot hit generates a memory structure of about 1k. Now this is not really a lot, but when you have a few 10's of thousands of hits from bots a day, it adds up." Mike also offers more remediation techniques.
As for the CFDocs custom tag, yes, that was me, Ben. :-) Glad it helped.
Charlie - You're late to my post, I'm late to someone else's, etc. This is a VERY important topic that many just don't think about, and bringing it up every now and again is only to the benefit of everyone.
I had the same issue on a server that was serving RSS via CF. As you correctly say, RSS readers generally don't honour cookies, so every RSS request was creating a new session. Combine that with a session-scoped user object, some session-scoped cached data, and a spider that was crawling the site and all its RSS links, and you very quickly end up with a site that mysteriously goes down in the middle of the night. I fixed it by
- adding a bit of code into OnRequestEnd.cfm to check whether the user was logged in, and if not, manually clear the session scope
- writing the RSS to flat files
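The OnRequestEnd.cfm part of that fix might look roughly like this; the `isLoggedIn` session key is an assumed convention, not something from the comment:

```cfml
<!--- OnRequestEnd.cfm sketch: drop per-session data for anonymous requests
      (SESSION.isLoggedIn is a hypothetical flag your login code would set) --->
<cfif NOT (StructKeyExists(SESSION, "isLoggedIn") AND SESSION.isLoggedIn)>
    <cflock scope="session" type="exclusive" timeout="5">
        <cfset StructClear(SESSION)>
    </cflock>
</cfif>
```

Be aware that StructClear on the session scope also removes the built-in CFID/CFTOKEN/SessionID keys, so test this carefully against your login flow before deploying it.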
I put the full sordid story here : "RSS Ate My Server" - http://instantbadger...
Some tips:
1. Implement client side caching using <cfheader> with either etags or Last-Modified.
2. Don't be scared to send a 503 error response to say "server busy try again later"
3. BlueDragon has the <cfthrottle> tag that helps you manage the repeat offenders
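Tips 1 and 2 above might be sketched as follows; the modification date, the `request.serverBusy` flag, and the retry interval are placeholders you would replace with your own values and load check:

```cfml
<!--- Tip 1 sketch: conditional GET via Last-Modified. The date is a
      placeholder; the string comparison is a deliberate simplification --->
<cfset lastModified = CreateDate(2006, 10, 1)>
<cfset headers = GetHttpRequestData().headers>
<cfif StructKeyExists(headers, "If-Modified-Since")
      AND headers["If-Modified-Since"] EQ GetHttpTimeString(lastModified)>
    <cfheader statuscode="304" statustext="Not Modified">
    <cfabort>
</cfif>
<cfheader name="Last-Modified" value="#GetHttpTimeString(lastModified)#">

<!--- Tip 2 sketch: shed load when busy. request.serverBusy is a hypothetical
      flag you'd set from your own monitoring of queue depth, memory, etc. --->
<cfif request.serverBusy>
    <cfheader statuscode="503" statustext="Service Unavailable">
    <cfheader name="Retry-After" value="120">
    <cfabort>
</cfif>
```

Well-behaved spiders and feed readers respect both mechanisms, which cuts the number of full page renders (and thus sessions) dramatically.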
I had client variables turned on and in the registry in a CF7 legacy app, but I never used them so I didn't think about it. The above problem manifested in a strange way which made perfect sense in retrospect, but was hard to figure out. The two symptoms I had were:
1) Server restarts took a very long time, over 20 minutes.
2) jrun.exe would peg one of the four CPUs so the machine ran at 25% - but the problem would only occur an hour after a restart, so I kept thinking I had it fixed, because problem #1 made me chary of restarting this production server.
It turned out that I had over half a million garbage registry entries from crawler hits, and every 67 minutes the purge process would begin, but it couldn't actually purge them - either 'cause there were too many, or 'cause new ones were created faster than it deleted the old ones, or for some other reason - so it just hogged as much CPU time as the OS would give it. Other pages ran, just sluggishly.
I finally tracked the problem down with Sysinternals' procmon, chasing the pegged jrun process's events and seeing several dozen registry read/write attempts per second.
Turned off client management in the sites, moved the default store (I think to cookies) and spent the next couple hours watching regedit delete all the keys and everything was Hunky Dory...
Wish I'd taken Charlie up on his offer to help me troubleshoot this problem waaaay back at that first MURA conference :)
Hope this helps somebody,
8riaN
If you can't solve things on your own with that info, I am available to help under my short-term, remote consulting. More at www.carehart.org/consulting.