Is A Port Number Required in the HTTP Host Header?

Well? Is it?

How’s this for a definitive answer: “Yes and no.”

We find the answer in RFC 2616 section 14.23:

The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI: “Host” “:” host [ “:” port ]

A “host” without any trailing port information implies the default port for the service requested (e.g., “80” for an HTTP URL).

So if an HTTPS request is made to a non-standard port, say 29043, then you should send a port even though the RFC doesn’t compel you to. And if you make HTTP or HTTPS requests to standard ports, then it’s probably best to omit the port string.

The above is my interpretation. I’ve maintained an HTTP client for thirteen years and this has been a point of contention. In the course of all that time, I’ve added and dropped :port from the header. Like Jason in a hockey mask, it keeps coming back. In its latest iteration, siege implements the interpretation you see above. If the port is non-standard, it appends :port to the string. If it is standard, then it simply sends the host.

Look for this feature in siege-3.0.8

 

 

 

 

Posted in Applications, Siege | Leave a comment



Siege 3.0.7 Release

Here’s the format for a location header,  Location: absolute_url

Unfortunately, many developers don’t care about standards and Internet Exploder is famous for letting them get away with it. When siege followed the letter of the law, I was inundated with bug reports that weren’t bugs at all. If siege is confused by Location: /haha that’s your developer’s problem, not mine. Against my better judgement and beginning with siege-3.0.6, I started constructing absolute_urls from relative paths. Unfortunately, my parser missed a usecase: localhost. Siege 3.0.6 will barf on this:

Location: http://localhost/haha_or_whatever

Technically, I didn’t miss localhost. If you look at url.c:459 you’ll see this:

// XXX: do I really need to test for localhost?

It didn’t occur to me that people would run siege on the same server as their webserver.  My bad. There are many tests besides load tests.

All siege users running version 3.0.6 should upgrade to siege-3.0.7.tar.gz

Posted in Siege | Leave a comment



An Old Dog Learns A New Trick

Beginning with version 3.0.6-beta2, siege reacts differently to –reps=once.

In the past, when you invoked –reps=once, each siege user would invoke each URL in the file exactly one time. If urls.txt contained 100 files and you ran -c10 –reps=once, siege would finish its business with 1000 hits.

That was then.

This is now: siege runs each URL in the file exactly once. If you run -c10 –reps=once, then siege will split the file among all 10 users and hit each URL one time. Whereas in the past, you’d finish with 1000 hits, you now finish with 100 hits.

This should give you greater control by making tests more precise.

Posted in Applications, Siege | Leave a comment



Siege 3.0.4 Becomes Part of the Problem

Siege 3.0.4 was just released. It contains a feature that I’ve added with a certain amount of reluctance. To understand the feature and the reason for my trepidation, let’s visit RFC 2616 and read what it has to say about Location headers:

For 3xx responses, the location SHOULD indicate the server's
preferred URI for automatic redirection to the resource. The 
field value consists of a single absolute URI.
    Location = "Location" ":" absoluteURI
An example is:
    Location: http://www.w3.org/pub/WWW/People.html

That’s pretty clear, right? The value of a location header must be an absolute URI. Yet a large number of developers ignore that directive. Here’s the response from a server running SquirrelMail, a popular web-based email program:

     HTTP/1.1 302 Found
     Date: Tue, 17 Sep 2013 16:50:52 GMT
     Server: CERN/1.0A
     X-Powered-By: PHP/5.2.5
     Location: src/login.php
     Content-Length: 0
     Connection: close
     Content-Type: text/html; charset=WINDOWS-1251

Although that Location header violates RFC 2616, nearly every web client will follow it to SquirrelMail’s intended destination. I say “nearly every client.” Until version 3.0.4, siege wouldn’t have followed it any where. It would have scratched its head and said, “Fsck it. Next URL.”

It is with some reluctance that I’ve included siege in the community of clients that allow developers to circumvent established standards. This convention has created a slew of bad coding practices on the world wide web. Didn’t close a table with an end tag? That’s okay, M$ will close it for you. Used a relative URI in a Location header? Don’t worry, siege will normalize it for you.

Ironically, version 3.0.4 includes one other feature enhancement. Its default User-agent is now in full compliance with RFC 2616. You win some, you lose some. And so it goes….

Posted in Applications, Siege | 5 Comments



Siege 3.0.3 and URL Encoding

URL encodingURL encoding aka URL escaping aka percent encoding is a mechanism for converting URL characters into a format that can be transmitted by HTTP. Reserved characters are replaced by a hexadecimal value preceded with a ‘%’ which is an escape character. If a URL contains a space, for example, it must be encoded for transmission. Your browser takes a space and reformats it as %20.

Siege, on the other hand, does nothing. It expects you to encode your own damn URLs … that is, until now! Percent encoding is available in siege starting with version 3.0.3-beta2. When it emerges from beta, the first stable version to support this feature will be 3.0.3.

Really? Siege has been around since 1999 and you’re only now adding this feature?

Well, you guys never asked and I haven’t had much need for it. Lately, however, I’ve noticed many of you are asking about json. I suspect URL escaping will be helpful to those folks. Consider this:

siege -g ‘http://www.joedog.org/siege/echo.php?q={ “Hello” : “world” }’

 GET /siege/echo.php?q=%7B%20%22Hello%22%20:%20%22world%22%20%7D HTTP/1.0
 Host: www.joedog.org
 Accept: */*
 User-Agent: JoeDog/1.00 [en] (X11; I; Siege 3.0.3-beta2)
 Connection: close

Booya! Just make sure you single quote the URL like in the example above.

Since URL escaping is in its early stages, I’ve provided a mechanism for disabling it. Inside $HOME/.siegerc add the following: url-escaping = false The default value is true.

Anything else in version 3.0.3-beta2 that we should know about?

Why yes! We changed behavior for -g/–get. When you retrieve a page using -g/–get, siege sets its protocol to HTTP/1.0 so the page is human readable. We don’t need to read chunked encodings and neither do you.

 

H/T: Your JoeDog would like to give a shout out to the folks at wget from whom he completely stole most of the code necessary to implement URL escaping. Cheers.

 

Posted in Applications, Siege | 1 Comment



Siege and the Single Cookie

An emailer wondered if siege could be configured to refuse cookies. I had a vague recollection on the matter. My brain cells were all, “Yeah, sure, you can disable cookies.”

The best place to look for more esoteric features is inside $HOME/.siegerc If you don’t have one, you can generate a new one with this command: siege.config That command will place a new one inside your home directory. ‘siege.config’ is designed to build a resource file which is compatible with your version of the program.

I checked the latest version of that file and I found nothing to disable cookies. Really? That seems like a no-brainer. Why wouldn’t we allow users to disable that? “Sorry,” I replied. “Siege can’t disable cookies.”

Yet my own response was bothersome. I was certain you could turn that off so I checked the code with ‘egrep cooki *.c’ Sure enough, siege parses siegerc for ‘cookie’ which accepts true or false. The latter disables cookie support.

Documentation for this feature has been added to siege-3.0.1-beta4. The feature itself is available in just about every contemporary version of siege that’s floating around the Internets. Just add ‘cookies = false’ to your siegerc file to disable their support.

Posted in Applications, Siege | 2 Comments



We’re Number One!

nixCraft selected siege as the Number One Greatest Open Source Terminal Application of 2012. We’re not sure what that means other than AWESOME! It’s been a long road to 2012 greatness. The project began in late 1999 and was released into the wild in early 2000. Since then, it has been developed shaped by countless people from all corners of the globe.

Even though siege is a very esoteric program, its source code is downloaded around 50,000 times a year. Binaries are distributed by most major Linux vendors.  One day  a co-worker wanted to run a test from a new server so he asked if I could get him some binaries. Before I could react to the request, he said, “Nevermind.” He ran ‘apt-get install siege’ and got it from the vendor. That was a pretty cool moment for reasons other than the fact that it saved me some work.

Let’s be clear. This isn’t my recognition. It’s our recognition. For twelve years siege has been a community project and its direction will continue to be shaped by the people who use it. “Now let’s eat a god damn snack.” — Rex Ryan to the siege community.

Posted in Applications, Siege | 1 Comment



Concurrency and the Single Siege

We’re frequently asked about concurrency. When a siege is finished, one of its characteristics is “Concurrency” which is described with a decimal number. This stat is known to make eyebrows furl. People want to know, “What the hell does that mean?”

In computer science, concurrency is a trait of systems that handle two or more simultaneous processes. Those processes may be executed by multiple cores, processors or threads. From siege’s perspective, they may even be handled by separate nodes in a server cluster.

When the run is over, we try to infer how many processes, on average, were executed simultaneously the web server. The calculation is simple: total transactions divided by elapsed time. If we did 100 transactions in 10 seconds, then our concurrency was 10.00.

Bigger is not always better

Generally, web servers are prized for their ability to handle simultaneous connections. Maybe your benchmark run was 100 transactions in 10 seconds. Then you tuned your server and your final run was 100 transactions in five seconds. That is good. Concurrency rose as the elapsed time fell.

But sometimes high concurrency is a trait of a poorly functioning website. The longer it takes to process a transaction, the more likely they are to queue.  When the queue swells, concurrency rises. The reasons for this rise can vary. An obvious cause is load.  If a server has more connections than thread handlers, requests are going to queue. Another is competence – poorly written apps can take longer to complete then well-written ones.

We can illustrate this point with an obvious example. I ran siege against a two-node clustered website. My concurrency was 6.97. Then I took a node away and ran the same run against the same page. My concurrency rose to 18.33. At the same time, my elapsed time was extended 65%.

Sweeping conclusions

Concurrency must be evaluated in context. If it rises while the elapsed time falls, then that’s a Good Thing™. But if rises while the elapsed time increases, then Not So Much™. When you reach the point where concurrency rises and elapsed time is extended, then it might be time to consider more capacity.

 

Posted in Applications, Siege | 3 Comments



HTTP Authentication

Some of you seem to confuse Basic authentication with form-based authentication. They’re not the same and the differences are important. If you don’t configure siege for the appropriate authentication method, it will be on the outside looking in at an HTTP-401.

Basic authentication occurs at the protocol level. It was originally described in HTTP/1.0 and later moved to RFC 2617. Basic authentication is a challenge/response framework. When the server receives a request for a protected resource, it challenges the user to authenticate himself. It will make the item available only after the user is autheticated.

Here’s an example exchange using basic.php from the html directory inside the siege source code:

GET /siege/basic.php HTTP/1.0
Host: http://www.joedog.org
Accept: */*
Accept-Encoding: gzip
User-Agent: JoeDog/1.00 [en] (X11; I; Siege 2.71b6)
Connection: close
HTTP/1.1 401 Authorization Required
Date: Thu, 16 Feb 2012 13:09:53 GMT
Server: CERN/1.0A
X-Powered-By: PHP/5.2.5
WWW-Authenticate: Basic realm="siege_basic_auth"
Status: 401 Unauthorized
Content-Length: 178
Connection: close
Content-Type: text/html; charset=WINDOWS-1251
GET /siege/basic.php HTTP/1.0
Host: http://www.joedog.org
Authorization: Basic c2llZ2U6aGFoYQ==
Accept: */*
Accept-Encoding: gzip
User-Agent: JoeDog/1.00 [en] (X11; I; Siege 2.71b6)
Connection: close
HTTP/1.1 200 OK
Date: Thu, 16 Feb 2012 13:09:53 GMT
Server: CERN/1.0A
X-Powered-By: PHP/5.2.5
Content-Length: 278
Connection: close
Content-Type: text/html; charset=WINDOWS-1251

See what happened? Siege requested /siege/basic.php and the server was all “Whoa! I don’t know who you are.” It issued an HTTP 401 challenge to siege which responded by sending its username and password in BASE64 encryption: c2llZ2U6aGFoYQ==

In this example, I emulated HTTP Basic authentication with a php program. Typically, Basic auth is setup at the server level. Here’s an example in apache:

<Location "/siege">
   AuthType basic
   AuthName "siege_basic_auth"
   AuthBasicProvider file
   AuthUserFile /var/www/etc/passwd
   AuthGroupFile /var/www/etc/group
   Require valid-user
   Require group siege
   Satisfy All
 </Location>

To configure siege to use basic authetication, you need to add a login to your .siegerc file. Search the file for WWW-Authenticate. The directive is login and it takes three values separated by a colon. username:password:realm. Our basic.php username and password are ‘siege’ and ‘haha’. So our login looks like this:

login = siege:haha:siege_basic_auth

The third argument (realm) is optional. If you don’t specify a realm, siege will send ‘siege:haha’ every time it faces an HTTP basic challenge. By setting a realm, you can configure it to use multiple logins:

login = admin:secret:Administration
login = siege:haha:siege_basic_auth
login = root:d41ly:high_level

Now you can also restrict access programmatically. This is referred to as form-based authentication. In order to configure siege to login in this manner, you’ll need to reproduce a browser’s action.

To illustrate this, we’ve included login.php in the html directory of the siege source code. That page accepts both GET and POST requests. It produced an HTML form that looks like this:

<td>Username: </td><td>
<input type='text' name='username' value='' size='30'></td>
<td>Password: </td><td>
<input type='password' name='password' value='' size='30'></td>

To login to this form, you’ll need to provide field values that match the form. Your parameters must match the form input names. In this case it’s ‘username’ and ‘password’.

http://my.server.com/login.php?username=siege&password=haha
http://my.server.com/login.php POST username=siege&password=haha

If your entire site requires authentication you can add a login URL to your .siegerc file. If this value is set, siege will access that URL before it does anything. Search your .siegerc file for ‘login-url’. Here’s an example using one of the URLs we constructed above:

login-url = http://my.server.com/login.php POST username=siege&password=haha

After it hits that URL, siege will start running through the list of URLs you created.

Happy hacking.

Posted in Applications, Siege | 8 Comments



Recent Comments

  • roshni: Hi jeff, I need your help regarding running urls in a file containing post directives. Could you please send...
  • Alle: In seige, what does the pink result mean?
  • Windows User: Nice collection of Perl modules. Thanks for sharing.
  • Jeff Fulmer: No idea. What do you see in the webserver’s logs?
  • roro: Hi Jeff, I’m running siege in cygwin 32 bit, with SSL. When hitting an https server with this call,...