Referer Spammers

The Internets are full of spam. Maybe you’ve noticed?

It’s in your inbox, in your comments, and scattered throughout your web forums. Every spammer is a bag of dicks, but the worst bottom feeder on the Internets is the referer spammer.

If you’ve never administered a website, then you’ve probably never heard of referer spam. Yeah, what is that?  Glad you asked. These dregs send requests to your web site with a fabricated referer that points to a site they want to advertise. Ideally, they’ll send requests to a site that publishes its traffic reports. When their URL makes the report, they get a free link back to their site.

Sites that publish their usage reports are easy to find. Put this in the Google Machine and see what pops up: “Top * Total Search Strings”. What we’re looking for is a report like “Usage Stats: Top Referers”. Your JoeDog can get himself onto that report by doing this:

Bully $ siege -H "Referer: http://www.joedog.org/" -g http://www.pickart.at/
HEAD / HTTP/1.0
Host: www.pickart.at
Accept: */*
User-Agent: Mozilla/5.0 (unknown-x86_64-linux-gnu) Siege/3.0.8
Referer: http://www.joedog.org/
Connection: close
HTTP/1.1 200 OK
Date: Fri, 03 Oct 2014 17:53:38 GMT
Server: Apache
Connection: close
Content-Type: text/html

Now if he’s really intent on making that report, he’ll repeat that request a few hundred times and place himself at number two on the chart. But here’s the thing: Referer Spammers will spam your logs even if you don’t publish your reports. They’ll go to all that trouble just to lure webmasters to their esoteric fetish sites.

So what can you do to prevent this stuff? Mostly you can decrease their incentive.

  1. Put your usage stats inside a password-protected area.
  2. Add a robots.txt with a bot exclusion rule so search engines don’t index the reports.
  3. Add a rel="nofollow" attribute to every link, again so engines don’t follow them.
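For step 1, the Apache side might look like this. A hedged sketch only: the directory path, auth file, and “/usage” location are illustrative, not taken from the post.

```apacheconf
# Require a password for the published usage reports
# (paths below are illustrative)
<Directory "/var/www/html/usage">
    AuthType Basic
    AuthName "Usage Stats"
    AuthUserFile /etc/httpd/stats.passwd
    Require valid-user
</Directory>
```

Create the password file with htpasswd -c /etc/httpd/stats.passwd youruser, and for step 2 add “Disallow: /usage/” under “User-agent: *” in robots.txt.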

I guarantee you’ll still get the stuff. They’ll send faked referrals just to capture the attention of the site’s administrators, but at least you won’t reward them with a boost to their PageRank.

NOTE: Yes, Your JoeDog spelled Referrer with only two r’s. Most humans use three. Phillip Hallam-Baker is not most humans. He was the first guy to miss an ‘r’ in the original HTTP specification. I say, “first guy” because hundreds of eyeballs viewed that document and none of them noticed the misspelling. By the time it became RFC1945, “Referer” was set in stone. It would have been easier to change the world’s English-language dictionaries at that point….

Posted in Apache, Applications, Security | Leave a comment



So Are You Vulnerable To Shell-shock?

Here’s a quick command line test to see if you’re vulnerable to shell-shock, the bash vulnerability that everyone — I mean everyone — is talking about:

$ env x='() { :;}; echo 1. env' bash -c "echo 2. bash"

If your bash is vulnerable, it will execute the echo command embedded in the environment variable; if it’s not vulnerable, it will only execute the stuff after -c.

A vulnerable system prints this:

$ env x='() { :;}; echo 1. env' bash -c "echo 2. bash"
1. env
2. bash

A non-vulnerable system prints this:

$ env x='() { :;}; echo 1. env' bash -c "echo 2. bash"
2. bash

On the vulnerable system, the echo command that is set in the environment is executed by bash when the shell is invoked:

env x='() { :;}; echo 1. env' bash -c "echo 2. bash"

The embedded echo 1. env should NOT be executed. That’s a bug; it needs to be fixed.
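For context, Shellshock piggybacks on a legitimate bash feature: functions can be exported to child shells through the environment. A minimal sketch of the legitimate behavior (the function name is made up for illustration):

```shell
# Define a function and export it to child shells.
greet() { echo hello; }
export -f greet

# A child bash re-imports greet from the environment and runs it.
# Shellshock was a bug in that re-import parser: it kept executing
# whatever followed the function body.
bash -c greet   # prints: hello
```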

NOTE: The second command was run on the server that hosts this blog entry. You guys can quit trying, mmmkay?

Posted in Applications, Security, sh | Leave a comment



Rear Recovery Onto Different Hardware

Your JoeDog still likes rear.

He uses it for bare metal recovery and system cloning. Recently he had to clone one server onto older hardware as part of a disaster recovery exercise. It was problematic.

Problem one: The rear recovery disk could not connect to the network.

This system had bonded NICs and Your JoeDog started to suspect they were causing an issue. When the recovery disk booted, he brought down all the network interfaces and tried to assign a new address to the server. The routing table looked fine. The eth0 config looked fine, but the network was unreachable.

Acting on a hunch that bonded NICs were giving him fits, Your JoeDog did a recursive grep of the rear directory …

… wait a minute, what’s a recursive grep?
You can do it like this:

$ find /usr/share/rear -print | xargs egrep -i bond

Cool, thanks …
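As an aside, GNU grep can do the recursion itself, no find | xargs pipeline required. A hedged sketch run against a throwaway directory (the rear path in the post would work the same way):

```shell
# Build a tiny tree to search (paths illustrative):
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
echo "BONDING=y" > "$tmp/sub/net.conf"

# The find | xargs form from the post...
find "$tmp" -type f -print0 | xargs -0 grep -il bond

# ...and GNU grep's built-in recursion, which finds the same file:
grep -ril bond "$tmp"

rm -rf "$tmp"
```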

Anyway, as a result of that search, he found a feature called SIMPLIFY_BONDING. With a little more digging, he discovered that it takes ‘y’ or ‘n’, so Your JoeDog set it to ‘y’ and re-archived the server. He added the directive to /etc/rear/local.conf:

SIMPLIFY_BONDING=y

When the server booted from the new recovery disk, the only network interface was eth0. Your JoeDog reset that address with ifconfig and he was able to clone the server from his rear archive. SUCCESS!!!!

Problem two: No success! After the rear recovery, the kernel panic’d and the server wouldn’t boot. Unhappy sad time. 

Your JoeDog was all, “Hmmm I’ll bet I need to rebuild the kernel for new hardware….”

So he restored again from rear. This time, when the recovery was complete, he chroot’d the mount point and rebuilt the kernel.

… wait a minute! How do you do that?
Glad you asked. Here’s my command history:

$ chroot /mnt/local
$ export PATH=/sbin:/bin:/usr/sbin:/usr/bin
$ cd /boot
$ mkinitrd -f -v initrd-2.6.32-431.20.3.el6.x86_64kdump.img \
                 2.6.32-431.20.3.el6.x86_64

NOTE: Whatever you call the kernel, i.e., whatever you use for the second argument of mkinitrd, make sure you have a directory by the same name in /lib/modules, i.e., /lib/modules/2.6.32-431.20.3.el6.x86_64

DOUBLE NOTE: Once you’re inside /boot, do an ls to find available kernel images. They’ll begin with initrd- and end in .img

Now get yourself some rear.

Posted in Applications, Rear | Leave a comment



Shellshocked

Wired provides an interesting angle on the bash shell bug that has all your panties in a bunch:

[Brian] Fox drove those tapes to California and went back to work on Bash, other engineers started using the software and even helped build it. And as UNIX gave rise to GNU and Linux—the OS that drives so much of the modern internet—Bash found its way onto tens of thousands of machines. But somewhere along the way, in about 1992, one engineer typed a bug into the code. Last week, more than twenty years later, security researchers finally noticed this flaw in Fox’s ancient program. They called it Shellshock, and they warned it could allow hackers to wreak havoc on the modern internet.

[Wired: The Internet Is Broken]

Posted in Applications, Programming, sh | Leave a comment



Is Hardware Outpacing Software Or Is It The Other Way Around?

Here’s an interesting experiment.

After hearing two strong players argue that the only real progress in chess engines in the last ten years was due to faster computers, a special match was played to challenge this idea. Komodo 8 ran on a smartphone while a top engine of 2006 used a modern i7 computer that runs 50 times faster. This is the difference between Usain Bolt and the Concorde. Guess what happened?

Posted in Applications | Leave a comment



Fido 1.1.3

Your JoeDog had a requirements change. “Stupid requirements!” He had to ensure each file in a directory and all its sub-directories was less than eight days old. Unfortunately, Your Fido didn’t traverse directory trees. He stood watch only at the top of the tree.

That’s the problem with dogs: they have a mind of their own.

Without much effort, fido learned a new trick. It now recursively searches a directory for files. To leverage this feature, you’ll have to give it a command. “Recurse, boy, recurse!”

/export {
  rules   = exceeds 7 days
  exclude = ^\.|CVS|Makefile
  action  = /usr/local/bin/sendtrap.sh
  recurse = true
}

recurse takes one of two values: true or false. True means search the tree; false means remain at the top level. If you don’t set a recurse directive, fido treats it as false, i.e., it remains in the top directory.

[Trending: Fido-1.1.3]

Posted in Applications, Fido, Release | Tagged | Leave a comment



Linux Bare Metal Recovery With Rear

Your JoeDog loves rear! And who doesn’t, amirite?

Except it’s not that rear. It’s an acronym for Relax and Recover, a Linux bare metal recovery tool.

Your JoeDog has been using Mondo for cloning systems. It’s good software that served him well despite difficulties moving from one hardware set to another. If Your JoeDog archived sd disks and recovered to cciss, then he was knee deep in i-want-my-lvm hell.

Rear makes those types of migrations much easier. If you archive a server using one type of disk driver and recover it to one that requires another, rear reworks the disk layout for you. It’s also configured to ignore external disks. If you archive a server connected to a SAN, rear simply ignores those multipath devices.

Like Mondo, you can archive and recover from an NFS server. Here’s a suggested configuration for NFS archiving. Place these directives inside /etc/rear/local.conf:

OUTPUT=ISO
BACKUP=NETFS
NETFS_URL=nfs://10.37.72.44/export
NETFS_OPTIONS=rw,nolock,noatime
OUTPUT_URL=file:///export

To archive the system, run ‘rear -v mkbackup’

This configuration creates an ISO image called ‘rear-hostname.iso’ inside 10.37.72.44/export/hostname. To recover the server, burn that ISO onto a CD and boot the system with it. Select the Recover option then run ‘rear recover’ at the command prompt.

“It’s that simple,” Your JoeDog said with the zeal of a recent convert. He’ll be back to bitch about rear in a couple weeks but for now it’s nothing but love….

Posted in Applications, Rear | Leave a comment



Is A Port Number Required in the HTTP Host Header?

Well? Is it?

How’s this for a definitive answer: “Yes and no.”

We find the answer in RFC 2616 section 14.23:

The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI:

Host = "Host" ":" host [ ":" port ]

A “host” without any trailing port information implies the default port for the service requested (e.g., “80” for an HTTP URL).

So if an HTTPS request is made to a non-standard port, say 29043, then you should send the port; a bare host implies the default port, which isn’t what you’re requesting. And if you make HTTP or HTTPS requests to standard ports, then it’s probably best to omit the port string.

The above is my interpretation. I’ve maintained an HTTP client for thirteen years and this has been a point of contention. In the course of all that time, I’ve added and dropped :port from the header. Like Jason in a hockey mask, it keeps coming back. In its latest iteration, siege implements the interpretation you see above. If the port is non-standard, it appends :port to the string. If it is standard, then it simply sends the host.
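That interpretation boils down to a few lines. A hedged sketch in shell (illustrative only, not siege’s actual C code): append :port only when the port is non-standard for the scheme.

```shell
# Build a Host header per the interpretation above.
# host_header HOST PORT SCHEME
host_header() {
  host=$1; port=$2; scheme=$3
  standard=80
  [ "$scheme" = "https" ] && standard=443
  if [ "$port" -eq "$standard" ]; then
    printf 'Host: %s\n' "$host"          # standard port: omit it
  else
    printf 'Host: %s:%s\n' "$host" "$port"  # non-standard: send it
  fi
}

host_header www.joedog.org 80 http       # -> Host: www.joedog.org
host_header www.joedog.org 29043 https   # -> Host: www.joedog.org:29043
```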

Look for this feature in siege-3.0.8.

Posted in Applications, Siege | Leave a comment



Siege 3.0.7 Release

Here’s the format for a Location header: Location: absolute_url

Unfortunately, many developers don’t care about standards, and Internet Exploder is famous for letting them get away with it. When siege followed the letter of the law, I was inundated with bug reports that weren’t bugs at all. If siege is confused by Location: /haha, that’s your developer’s problem, not mine. Against my better judgement, and beginning with siege-3.0.6, I started constructing absolute_urls from relative paths. Unfortunately, my parser missed a use case: localhost. Siege 3.0.6 will barf on this:

Location: http://localhost/haha_or_whatever

Technically, I didn’t miss localhost. If you look at url.c:459 you’ll see this:

// XXX: do I really need to test for localhost?

It didn’t occur to me that people would run siege on the same server as their webserver.  My bad. There are many tests besides load tests.
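The fix amounts to grafting a relative Location onto the original request’s scheme and host. A rough sketch in shell (illustrative only; siege’s real parser lives in url.c):

```shell
# resolve_location BASE_URL LOCATION_VALUE
resolve_location() {
  base=$1; loc=$2
  case $loc in
    http://*|https://*)                 # already absolute: pass through
      printf '%s\n' "$loc" ;;
    /*)                                 # relative path: keep scheme://host
      root=$(printf '%s\n' "$base" | sed -E 's#^(https?://[^/]+).*#\1#')
      printf '%s\n' "$root$loc" ;;
    *)                                  # anything else: pass through
      printf '%s\n' "$loc" ;;
  esac
}

resolve_location "http://localhost/old" "/haha"
# -> http://localhost/haha
```

Note that localhost needs no special case here: the host is taken from the request URL, whatever it is.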

All siege users running version 3.0.6 should upgrade to siege-3.0.7.tar.gz

Posted in Siege | Leave a comment



It Knows Me Better Than I Know Myself….

I write a lot of software with which I interact. If it’s easy for me, then it’s easy for you. I try to keep it easy for me. JoeDog’s Pinochle is the first program against which I’ve competed. It’s been a surreal experience.

The program was designed to be competitive against me. Tonight it took two out of three games. The damn thing knows me inside and out. And why not? I wrote it. And while I can exploit some knowledge of its inner workings, I can’t predict all its behavior. It was designed to learn bidding from experience.

Bidding is the hardest aspect of this game. The team that wins the bid has an incredible opportunity to earn a lot of points. At the same time, overbids come at a large price. A failure to make the bid means the bid is deducted from your score.

When the game was first released, its bids were implemented programmatically. I like to think I’m a pretty good programmer but that version of the game played like a moran. To improve it, I had the game play itself hundreds of thousands of times. It would store those results and use them to generate future bids.

This implementation has resulted in a much more competitive program. Now it bids more aggressively — much more aggressively. It bids like me which is odd because I didn’t tell it to do that. I told it to learn from its experience and as a result of that experience, its personality morphed into mine.

Posted in Applications, Java, Pinochle, Programming | Leave a comment


