
Easily finding cached versions of a site/page when it's down or gone

Have you ever had a web site "go dark" on you, or found that a given page on a site had somehow disappeared? Maybe it's only temporary (there may even be a "we're down" message, though the site or server may just fail to respond at all), or maybe the failure of the page or site will be permanent.

The good news is that there are at least two easy ways you may still be able to see the content you're missing: the Google cache (to at least see the last version which Google may have cached), and the Internet Archive "Wayback Machine", which often lets you see YEARS back in the history of a page or entire site, including one that may be long gone.

In this post I share tips (and gotchas) on using both tools.

They aren't GUARANTEED to have the page you're looking for, but I find that they do about 99% of the time I try them (and I use them a lot, because I'm often mining gold in old blog posts or articles which have gone away across many sites I have visited).

[Updated June 9 in a variety of ways, mostly minor, but with some additions in the "trip down memory lane" discussion.]

[....Continue Reading....]

The 100 most interesting posts on the Adobe ColdFusion blog, the past 3 years

The Adobe ColdFusion team blog often has some really interesting content, but I find that some people are either not aware of the blog or just don't keep up on it, or perhaps they have trouble finding something they saw before or maybe heard was there.

So here I present what I feel are the 100 (technically, 105) most interesting/useful posts made there over the past 3 years (2014-16), offering information about CF and CFML which should be valuable to readers for years to come.

[....Continue Reading....]

All My Blog Entries for 2012 (all but 3 about ColdFusion)

Following up my last blog entry (highlighting the top 10 most-viewed entries for my blog this year), here instead is a listing of all the entries I've done this year, in case it helps someone more easily see whether they missed any that might be interesting.

I present the list in two forms: first, just a list of all the entries (31 of them), and second, broken down by category, in case some category may be more interesting to you.

The entries

Here are the 31 entries, in descending order by date.

[....Continue Reading....]

Most-viewed ColdFusion blog entries of 2012

As the year comes to a close, many bloggers take a moment to document the most-viewed entries of the year on their blog. In that spirit, here are the top-viewed entries of the year for my blog.

I have more to say about the list (and such lists) below, but for those who like to "get to the point", here's the list:

[....Continue Reading....]

Blocking comment spam in BlogCFC (or it could be adapted to others)

Want another tool to help battle blog comment spam? Here's an approach I use that may benefit others. I look for certain bad URLs being referenced in the comment, and if they exist I block the comment. Sure, there are other solutions. I've wondered for a while about sharing this code publicly like this, but I get enough people who've asked for it that I figure I may as well.

Update: Ray has clarified (in a comment) that BlogCFC does already have this functionality, in the "trackback spamlist" feature (on the Settings page of the BlogCFC Admin). I thought that had only to do with trackbacks, not comments. If you're using BlogCFC, you should use that feature to achieve what I describe here. But some of the thoughts and techniques may still interest some.

What's the problem, for bloggers and commenters, and why Captcha isn't enough

We all know that comment spam is the bane of our existence. How many times have we seen comments referring to wowgold or battery crap or some foreign characters we can't even read? Sure, captchas and other tools are intended to try to stop it, but some still gets by. These are often real people typing this in, so they get past tools that try to block automated entries. (I appreciate that some tools do still more. Check out the link above to learn more of them.) For those still interested, press on.

These spammers are clever: they'll repeat words from earlier in the blog entry, or from some other commenter, or even from some entirely different blog entry, hoping the blog owner won't notice that a shifty URL has been planted in the text (or the URL field of the comment form), all trying to get a little Google pagerank love for the URL they're pimping.

So I wanted to come up with my own solution that simply detected and blocked any comments with references to those bad urls. What I did works for BlogCFC (admittedly an old edition), but the concept can be of value to you regardless of the blogging software you may use.

And to be clear, this bane of blog comment spam is not just an annoyance for bloggers themselves, but also for blog commenters. Most blog software is set up to send us commenters a copy of any other comment someone posts. Even if a blogger is diligent about catching and deleting such comments (so they get no pagerank love from being posted), some of the damage is done in that the fellow commenters on that entry did get the email.

My solution

Again, I wanted a solution that let me detect and prevent submissions of spammy URL references. There's no blacklist for keywords in the version of BlogCFC I have.

Even then, I realize some don't like doing blacklists of keywords anyway, since you can get false positives. Then there's the challenge that if you look for some words, the spammers just change them. But for the problem above, their goal is to get their URL listed.

So I was interested in looking only for URLs, not just any "words". Further, I want to check in both the content field and the URL field of the comment. (And if it meant I blocked someone who was merely mentioning one of these spammy URLs, in a helpful way, I'm willing to risk that false positive.)

My approach

So the way I do it is that I created a file to track the bad urls. When I get a comment that's got content that's spam, I put any domains it refers to into that file (and then delete the comment, of course).

Then before accepting any new comment, my code reads that file. (Yes, on each comment submission. I could optimize things, of course, by caching the file contents for a period of time. I could also offer an interface to more easily add URLs to the badurl list file. I just haven't gotten to that. For now, I just edit it by hand, maybe a few times a month, now that I have most of the common crap URLs under control.)
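Here's a minimal sketch of that caching idea, in case it helps. It isn't what my blog runs: the function name getBadUrlList, the application.badUrlCache key, and the 10-minute default are all assumptions for illustration, and it assumes CF9 or later (or Lucee) with an application scope available.

```cfml
<cfscript>
// Hypothetical helper (not part of BlogCFC): returns the bad-URL list,
// re-reading the file only when the cached copy is older than maxAgeMinutes.
function getBadUrlList(required string filePath, numeric maxAgeMinutes = 10) {
    var cacheIsStale = !structKeyExists(application, "badUrlCache")
        || dateDiff("n", application.badUrlCache.loadedAt, now()) >= arguments.maxAgeMinutes;

    if (cacheIsStale) {
        // Two requests may both see a stale cache and each reload the file;
        // the named lock just keeps the cache writes from interleaving.
        lock name="badUrlCacheLoad" type="exclusive" timeout="5" {
            application.badUrlCache = {
                list = trim(fileRead(arguments.filePath)),
                loadedAt = now()
            };
        }
    }
    return application.badUrlCache.list;
}

// Usage in addcomment.cfm would be something like:
// badurllist = getBadUrlList(expandPath("badurls.txt"));
</cfscript>
```

The trade-off is that an edit to badurls.txt can take up to maxAgeMinutes to be noticed, which seems acceptable for a file I only touch a few times a month.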

About the blacklisted urls file

Rather than post the badurls.txt file here, I'll send it to you directly if you leave me a comment (the form will ask for your email address, which is not shared, and your URL; tell me the URL of your CF blog). I don't want to give away intel to the spammers, plus by me sending it along you'll get the latest.

Another thing I could do is create a service where the badurls file is kept and accessed/updated centrally. Again, I just haven't gotten to that yet, nor to creating a RIAForge project for this. I'll wait to see what people think.

The badurls file is really just one big long list (comma-separated) of bad domains. Here's just a sample of the first few entries (it's all just on one line):

dedikodulu.net,acrobatajans.com,dosyapaylasim.net

Note that I don't bother using the full url, and I even leave off the www. part, since some spammers use subdomains. Of course, I wouldn't add to the list a domain that looked like it could be legit. But if it looks suspicious, it's blacklisted.
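If you'd rather not trim each spammy URL by hand, something like the following could do it. This is only a sketch under my own assumptions: the helper name toBadUrlEntry is made up for illustration, and it assumes CF9 or later (or Lucee). It drops the scheme and a leading www., then keeps just the host.

```cfml
<cfscript>
// Hypothetical helper: reduce a URL to the bare domain form used in badurls.txt.
function toBadUrlEntry(required string rawUrl) {
    // Drop the scheme (http://, https://, etc.) if present
    var stripped = reReplaceNoCase(trim(arguments.rawUrl), "^[a-z]+://", "");
    // Drop a leading "www." if present
    stripped = reReplaceNoCase(stripped, "^www\.", "");
    // Keep only the host: cut at the first slash, colon, or question mark
    return lCase(listFirst(stripped, "/:?"));
}
</cfscript>
```

Note that it leaves any other subdomain in place, so for a spammer who rotates subdomains you'd still want to shorten the entry to the base domain yourself.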

What do the spammers see?

I don't tell the spammers that I'm rejecting them because of the spammy URL. I just report "Invalid request" as an error. I also happen to email myself when people attempt to send comments (in case they have problems with the captcha or for some other reason their comment doesn't make it), so I have fun watching how the spammers flail about trying again and again to get their crap in. :-)

I figure if it was a false positive and someone REALLY sincerely felt that their comment should be let in, despite their referring to one of these urls and getting rejected, they could just contact me directly (I offer a contact link on my blog), or they may think to enter a plain comment. Again, these are rare instances, I think. The benefit of cutting down on spam comments has far outweighed the risk.

Update: With regard to the BlogCFC "trackback spamlist" feature, I'll note that it doesn't offer any feedback at all if a comment has a blacklisted keyword/url. It just closes the form as if it took, but the comment is not posted.

What do I do with the badurls file? Show me some code.

I drop the badurls.txt file into the blog root directory (typically blog/client in blogcfc), in the same directory as the addcomment.cfm template. I then make just the following 3 edits to that addcomment.cfm template.

First, I add the following that reads the file in:

<cftry>
   <!--- ought to cache this and refresh when file changes --->
   <cffile action="READ" file="#expandpath("badurls.txt")#" variable="badurllist">
   <!--- the next line is just to test if the data in the file is in fact a valid CF list. if not, email me --->
   <cfset listerrcnt = listlen(badurllist)>
   <cfcatch>
      <!--- fall back to an empty list so the later check still runs if the read failed --->
      <cfset badurllist = "">
      <!--- type="html" so the dump arrives as readable HTML rather than raw markup --->
      <cfmail to="whoever" from="whoever" subject="failure during blog addcomment, badurl list processing" type="html"><cfdump var="#cfcatch#"></cfmail>
   </cfcatch>
</cftry>

Second, still in addcomment.cfm, I add a check to the code that handles adding a comment. It should go inside this block:

<cfif isDefined("form.addcomment") and entry.allowcomments>

and after the first IF test that follows this comment:

<!--- tests to block spammers --->

I added this:

<cfif findlist(badurllist,trim(GetHostFromURL(form.website))) or findList(badurllist,form.comments)>
   <cfset errorStr = errorStr & "- " & "Invalid request" & "<br>">
</cfif>

Sure, I could have done that in CFSCRIPT. Same with the next chunk coming up. Feel free to change it if that suits you. :-)

Needed (and created) a new UDF, FindList

You'll notice this calls a udf, findlist, which does something that surprisingly no built-in function does: searching one string for any of several items in a list. (For an explanation of how it differs from listfind and listcontains, see the version posted at CFLib. That udf is a little more complicated, as I expanded it based on some feedback from others.)

<cffunction name="findList">
   <!--- FindList, from Charlie Arehart--->
   <cfargument name="valuelist" required="Yes" type="string">
   <cfargument name="stringtocompare" required="Yes" type="string">
   <cfset var found=0>
   <!--- var-scope the loop index so it doesn't leak outside the function --->
   <cfset var x="">
   <cfloop list="#arguments.valuelist#" index="x">
      <cfif findnocase(x,arguments.stringtocompare)>
         <cfset found=1>
         <!--- one match is enough, so stop looking --->
         <cfbreak>
      </cfif>
   </cfloop>
   <cfreturn found>
</cffunction>
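For those who'd prefer CFSCRIPT, here's a rough sketch of how the UDF and the check might look. Treat it as an illustration rather than the code I run: the names findListScript and isSpammyComment are made up here, it assumes CF 9.0.1 or later (or Lucee) for the for-in loop over an array, and to keep it self-contained it searches the whole website field rather than calling GetHostFromURL first.

```cfml
<cfscript>
// CFSCRIPT rendition of findList: returns true if any list item occurs
// (case-insensitively) anywhere in stringToCompare.
function findListScript(required string valueList, required string stringToCompare) {
    var item = "";
    for (item in listToArray(arguments.valueList)) {
        // skip blank items, which findNoCase could never usefully match
        if (len(trim(item)) && findNoCase(trim(item), arguments.stringToCompare)) {
            return true; // stop at the first match
        }
    }
    return false;
}

// Hypothetical wrapper for the check made in addcomment.cfm:
// flags a comment if either the website field or the comment text
// mentions a blacklisted domain.
function isSpammyComment(required string badUrlList, required string website, required string comments) {
    return findListScript(arguments.badUrlList, arguments.website)
        || findListScript(arguments.badUrlList, arguments.comments);
}

// Usage would be something like:
// if (isSpammyComment(badurllist, form.website, form.comments)) {
//     errorStr = errorStr & "- Invalid request<br>";
// }
</cfscript>
```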

Hope all that may help someone. Feel free to comment.

If a tree falls in a forest and comments are disabled, how will you know? (COMMENTS WORKING AGAIN)

Oy, computers. Have you by any chance tried to leave me a comment here on my blog? Or tried to do a search of the blog? Sadly, neither would have worked. In the case of the search, it just never limited the results to entries with the search text. In the case of blog comments, it just kept showing the form and ignoring what you typed. I'm really sorry for any who may have been entering comments all this time, to no avail.

When I noticed I was getting no more comments, I knew something was up, and it turns out both were caused by the same thing. The issue was that a couple of weeks ago I switched my site to a new server (more on that later), and in the midst of the move I made a slight configuration mistake.

The bummer is that no one ever wrote me to tell me that their comments weren't taking. I offer my email address in a pod on the right of all pages. :-( Oh well. Problem solved now.

Bloggers: validate your feed on new entries, or you and your readers could suffer. Here's how

Have you ever found (as a blogger or as a reader of a blog's feed) that sometimes the feed just seems to stop working? It could be that it's become invalid. Here's a tip.

Bloggers: you really should validate your feed on every new submission. You never know when some special character you used (or copy/pasted from elsewhere) might make your feed invalid.

In this entry, I propose a couple of possible solutions, either that you may find or that you can easily add to your own blog.

Validating your feed. Does your blogging tool do it?

I imagine some blogging tools may even offer this as a feature. It's easy enough to validate one's feed with a tool like http://feedvalidator.org/. You can easily validate your own by adding your URL with http://feedvalidator.org/check.cgi?url=yoururl. (Technically, the value of yoururl should be URLEncoded.)

Validating it yourself on each new entry

If your blogging tool doesn't do that validation for you, here's another thought: you could easily do it yourself. Here's code I use to check the feed whenever a new entry is made. It looks at the validation result and sends me an email if it fails, which has saved my bacon a couple of times:

<!--- the feed to check, and the validator URL built from it --->
<cfset feedurl = "http://carehart.org/blog/client/rss.cfm?mode=full">
<cfset checkurl = "http://feedvalidator.org/check.cgi?url=#urlencodedformat(feedurl)#">
<cfhttp url="#checkurl#" resolveurl="Yes">

<div id="content">

<!--- the validator's result page says "congratulations" when the feed is valid --->
<cfif cfhttp.filecontent does not contain "congratulations">
   <cfmail to="myemailaddress" from="myemailaddress" subject="Your Blog's RSS feed has failed" type="HTML">
      <p><a href="#checkurl#">#checkurl#</a></p>
      #cfhttp.filecontent#
   </cfmail>
   <cfoutput>
      <h4>Validation Failed</h4>
      For #feedurl#. Email sent to myemailaddress.
      <p><a href="#checkurl#">#checkurl#</a></p>
   </cfoutput>
<cfelse>
   <cfoutput>
      <h4>Validation passed.</h4>
      For #feedurl#.
   </cfoutput>
</cfif>
</div>

Of course, you could just drop that code into your blogging code if you're comfortable doing that.

Using your blog tool's Ping feature

But if you don't want to edit your blogging code, you could do this just as easily with your blogging tool's ping feature, if it offers one. These are more typically used to provide one or more URLs which the blogging tool will call when you offer a new entry, such as to notify blog aggregators of your new entry (rather than waiting for them to come back to find your entry eventually).

You could use that same feature to have it go to a URL on your own site that runs the code above. That's what I do.

Is there a service doing this already?

I suppose someone could set up a service to do this, letting you pass in the URL and email addresses. For now, I'm not in a position to do that on my own server for others. One would need to be careful not to let this be abused in any way. I also imagine it could get used by a lot of folks.

I kind of wonder why some free service hasn't yet been created to do this. Surely someone could find a way to monetize it. :-)

Anyone know of such a tool?

Anyway, there's the idea and the code above, if it may help.

PS: This is more than just for bloggers

BTW, this applies to more than just blogs. It would make sense for anything where you add items and offer an RSS feed to read them, such as podcasts, news items, and more.

In fact, I've been meaning to write this entry for a long time, and was actually motivated when I came across some failing OPML for one of the CF blog aggregators today. I dropped a note to the owner, letting him know that someone had slipped in a bad character when they'd entered a new feed to him. I suggested he could benefit from this idea (as would others), and that I'd blog about it. There you have it.

CF Bloggers of the world, unite: come join a new google group to work together

Are you a CF blogger? I had a thought recently: we could probably share a lot with each other to make the most of our blogging efforts to the CF community.

For instance, Pete Freitag had a really neat tip today on optimizing your RSS feed to better manage traffic.

Regardless of which blogging software we use, we still serve the same CF community and just as we've all learned from and contribute to that community, we probably also know some useful tips and resources to share with each other about our blogging efforts.

So I created a new google group today, http://groups.google.com/group/cfbloggers, and I'm inviting all CF bloggers to join.

Sure, there are resources out there for all bloggers (like www.performancing.com), but I wanted to create something just for our community, without the traffic of the noisy broader world of bloggers, as well as to perhaps focus on issues of interest just to us, such as how we can organize our info better (perhaps together) for the CF community, how to monetize our blogs (maybe create a CF community-specific ad banner mechanism), how to solve common blogging-related problems (like aggregation and feed validation), and so on.

NOT another place to discuss CF

I don't mean for this to be yet another place to discuss CF questions and issues with each other. We all already have plenty of places to do that.

Joining the Group

For now, I've made it a private group. People must either be invited or request to join, and the discussions are not made public. I think that may be best to permit people to speak frankly. We can discuss if it should be made publicly viewable, but I'd propose that joining always be moderated to keep out folks who are not truly CF bloggers.

  • If you have a google account already, just log in and join, which you can do in one step
  • If you don't want to create a google account (needed only if you want to access the web interface), you can also drop me a note via the group join interface and I'll just add you to the list and you'll start getting mail whenever folks send a note to the list.

Of course, you don't need a gmail address or account to join. Any address will do.

Who was pre-invited to join?

I was torn between not inviting anyone (and hoping it would spread by word of mouth, emails, and blogging) and seeding the list with at least several folks to start. I decided the latter was a better choice, at the risk of offending anyone I might leave out. To avoid an incredible effort to think of/find all CF bloggers, I instead just used the list of names of bloggers in the "last 72 hours" display on FullAsAGoog's CF category, at the time I created the group.

If I had their email address or could find it quickly, I used it. And while typing their names into an outgoing email, if my email client showed me other folks with the same first name, etc, I added them to the list.

I'm sure I've left off many. No offense intended at all. That's why I'm doing this blog entry (but I realize that won't reach most of the CF bloggers, so feel free to spread the word.)

If you know anyone else who wants to join, have them use the links above.

Should be low-volume, which is good

I suspect this will be a low-volume list, except for spurts that surround interesting discussions, so I'd hope it would not add a burden to your inbox, but rather would be valuable even if you only got one new idea a month.

Hope you'll join at least for a while.

Do you blog? Do you identify yourself on your blog? Please do!

I'm so surprised by how many blogs I come across where the blogger has not identified themselves in any way: no name, no bio, no email link. I suppose some may do it intentionally, as some form of anonymity (and I do realize why some may not want to list their email), but I honestly think most just hadn't thought about whether to list their name or anything more.

I'd like to put out a plea to at least consider listing your name, either in your title ("clever name - by blogger name"), or just in some text below it, or in your toolbar. Better still would be a small bio, or a link to a page that has one. (Maybe it would help if blog software offered an "about" pod that made you think of it more readily.) A photo would be nice, too. And for reasons (and with cautions) I propose below, I recommend you also list your email address.

Why bother with name, bio, and/or email? Because it's in your interest!

There are a couple of reasons to consider it, and they help both you and your readers.

First, as for listing at least your name, a good reason is simply to associate yourself with all the value you create by your blog. Why not get credit for your work? Plus, many would really like to know who you are. (And if your blog software puts a tiny "by" under each blog entry, I'll argue that's not enough. I've missed that myself on more than one site.) Again, whether in the title, below it, or in the toolbar, just put it somewhere! :-)

As for a bio, again, even just a couple sentences about yourself (below the title or in the toolbar) can really personalize the blog. Don't assume everyone knows your background, even if they know you by name. Many readers will appreciate knowing more about where you work, where you're from, etc. Such details can also lend perspective to what you write about. (For instance, if you're a fan or a foe of something where that would color all of your posts, it can be helpful for people to realize, "oh, he works for them / on that open source project / with that tool," etc. Let people know where you're coming from.) But at least consider offering some background, even a single sentence.

Finally, as for your email address, someone may want to contact you to offer feedback that's not specific to a post. They may want to offer you work (and not want to announce that in a blog comment)--and even then, which post should they enter such a generic note to you in, anyway? Keep in mind that not all readers realize that you get notified of all comments by email, so they may give up trying to contact you.

Heck, they may even have trouble posting a comment, and therefore need *some* way to contact you. I've certainly seen that before.

But isn't it bad to post your email address online?

OK, I realize you may not want to offer your email, as spambots will capture it. But you've probably noticed more and more people listing their addresses as "name (at) domain". The thinking is that people can figure that out, but spambots (at least the dumber ones) will not. I'll grant that they'll eventually catch on. You just need to weigh how important the benefits are against the pain of more spam. (You do have a spam catching program, I hope? I love the one I use, Cloudmark Desktop. No, it's not free, but there are certainly many of them you can check out.)

Be careful using that (at) trick with Mailto links

If you do decide to use the (at) approach, but you also offer a mailto link, be careful: you need to list the "anti-spammer" address in the mailto (used to launch the email) as well as between the a tags (as shown to the user). Spambots grab all the text on your page, not just what's "visible". This is a pain, because then in the email that's opened the user must notice that you've done this and change it, or the mail will fail to get to you. What I do is explain that to the user by forcing some body text into the mail that's opened. Did you know that was possible?

<a href="mailto:charlie (at) carehart.org?body=please change the spam-fighting email address format I filled in for you, replacing the (at)!">charlie (at) carehart.org</a>

And for those who maybe already knew about it, did you know that you could also force a line break within such content in an HTML email link? See http://tipicalcharlie.blog-city.com/forcing_a_line_break_in_an_html_email_link.htm. (This is from another blog of mine, typicalcharlie.com, which is for generic, non-CF tips.)
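If you generate such a link from CFML, here's one way it might be built. This is a sketch under my own assumptions, not necessarily the technique from that other post: the variable names are made up, and it relies on the line break being a percent-encoded CRLF (%0D%0A), which is what RFC 6068 specifies for mailto bodies. Mail clients vary in how well they honor it.

```cfml
<cfscript>
// Illustrative values; substitute your own address and wording
displayAddress = "charlie (at) carehart.org";
bodyText = "Please change the spam-fighting email address format I filled in for you,"
    & chr(13) & chr(10)
    & "replacing the (at)!";

// urlEncodedFormat percent-encodes the spaces, parentheses, and the CRLF,
// so the href holds no raw spaces or line breaks.
mailtoLink = '<a href="mailto:' & urlEncodedFormat(displayAddress)
    & '?body=' & urlEncodedFormat(bodyText) & '">'
    & displayAddress & '</a>';

writeOutput(mailtoLink);
</cfscript>
```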

So please, bloggers, step up and identify yourself. We'll all appreciate it!

Copyright ©2017 Charlie Arehart
BlogCFC was created by Raymond Camden. This blog is running version 5.005.