Fork me on Github
Fork me on Github

Joe Dog Software

Proudly serving the Internets since 1999

Siege 3.1.0 Release Candidate 1

We received messages about core dumps that occur when siege is run with more than 700 threads. That’s a lot of threads, you guys! First things first — and this can’t be stressed enough — if you run siege with -c700, make sure your web server is configured with a pool of 700 threads. If you hit a pool of 256 threads with 700 users, all you’re gonna do is make a mess, mmmkay?

Now back to those core dumps. Here’s the thing: we’ve been unable to reproduce the problem but some diligent siegers have worked with us to get stack traces. As a result, we’ve added improved error handling and integrity checks in the problematic region. Your JoeDog would greatly appreciate if you could test version 3.1.0 release candidate 1. You can provide feedback here or by email, whichever you prefer.

http://download.joedog.org/siege/beta/siege-3.1.0-rc1.tar.gz



Millions of Cookies, Cookies For Me

cookiemonsterA couple weeks ago, a reader notified Your JoeDog of a problem with siege’s cookie handling. When the server sets a cookie that’s already stored, siege won’t update its expiration time. A JoeDog Fellow told him, “That’s a paddlin’.” The last cookie in, is the next cookie out.

The investigation took Your JoeDog into cookie.c, a file he hasn’t touched in years. “This code is awful,” he said. “Who wrote this shit? … Hmm, some guy named Jeff Fulmer, et al. If I ever meet that guy, I’ll tell him a thing or two.”

As an ethnic German with a tendency to over engineer — well, um — everything,  Your JoeDog decided to completely overhaul cookie handling. Why change two lines when you can rewrite everything, amirite? The file cookie.c now contains a COOKIE object with getters and setters, cookies.c is a cookie list object which houses all cookies in memory. Cookies will persist in $HOME/.cookies but the details are still being considered.

Here’s the thing. Siege isn’t a browser, it’s many browsers. We store cookies at the thread level. Each thread gets its own list of cookies. To distinguish them, we used pthread_self(), which is a long long int like 140018384393984 or something. Technically it’s not a long long int, it’s a pthread_t, but you get the point. The next time you fire up siege, its threads will have different IDs so we need a convenient way to associate stored cookies with new thread IDs.

Furthermore, there’s no guarantee that your next run will have the same number of simulated users. We need to consider how to handle that situation. If you have 50 stored cookies and you run 60 new users, should we stop assigning cookies at 50 or do we start repeating them starting with cookie 1 to user 51? Inquiring minds want to know. I want to know.

The new code is not yet in version control. If you want a copy, let us know. We can create a beta branch or send you a tar ball. Happy hacking.

NOTE: Your JoeDog refers to cookie and cookies as objects because they are. Yes, siege is written in C rather than C++ but you can write objects in C just fine thankyouverymuch. An object is simply a thing which references itself. To achieve this in C, you just pass a reference to the object as an argument.



Hey! There’s No configure Script!!1!1!!!1!

Your JoeDog gets this a lot lately: “The INSTALL file says run ./configure but there’s no configure file.” It took a second but it’s now clear what’s happening. You guys are grabbing the code directly from github.com

When Your JoeDog redesigned this site, he put that snazzy “Fork me on Github” banner up in the corner. Exciting! So now you guys are forking me! Here’s the thing. The stuff on github is THE source code, it’s not a source distribution. You can find the source distribution on the downloads site.

So what’s the difference? The source distribution contains helper scripts which are generated by autotools. You know, like that configure script. Since configure is built from other files, it’s not a source file and we don’t maintain it in version control.

That doesn’t mean you can’t fork me on github! If you take that route, you’ll need to build your own configure script. We’ll learn how to do that after the jump….

Continue reading Hey! There’s No configure Script!!1!1!!!1!



Siege 3.0.9

What’s Your JoeDog doing now? He’s knee-deep in old C code. This code generates software that calculates the optimum way to cut sheets of linoleum as they roll off a production line. Aren’t you glad you asked? How old is this code? It was last updated in 1999 when it was ported to HP-UX.

You know how an old song can take you back — sometimes to a good place, sometimes to hell? Old code works like that. This project was coded by other humans, but Your JoeDog sees his own flaws in it. He sees techniques that remind him to hang himself back in 1999.

Nobody codes like that anymore. There’s a reason why we’ve abandoned some techniques in favor of others. For the past two weeks, Your JoeDog has been dereferencing variables, debugging memory leaks and trying to figure out what’s whacking his stack. Context is everything, people. In this one, you don’t want anything whacking your stack.

Now siege already encapsulates much of his current programming philosophy. It’s written in C but it relies on object-oriented architecture. If you encapsulate memory management it makes it easier to pinpoint flaws.

Unfortunately, his personal projects haven’t kept up with industry standards. This coding experience has prompted him to fix his sins before they become unmanageable. Your JoeDog updated to gcc-4.7.4 and he watched the warnings fly! This version fixes all of those warnings. There’s nothing sexy about it but you should probably upgrade anyway.

[JoeDog: Siege-3.0.9]

 

 



CTR Is Hard

Sproxy is a word Your JoeDog invented to describe his [S]iege [Proxy]. At the time of this writing, this site has the top three positions for ‘sproxy’ on Google. In the past week, nine hundred people typed ‘sproxy’ into the Google machine. Of those nine hundred, only 110 clicked a link to this site. That’s a 12.22% click-through rate for a made-up word that describes an esoteric piece of software that exists right on this very site. Let’s just say that falls a little below expectation….

 

 

 



Is A Port Number Required in the HTTP Host Header?

Well? Is it?

How’s this for a definitive answer: “Yes and no.”

We find the answer in RFC 2616 section 14.23:

The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI: “Host” “:” host [ “:” port ]

A “host” without any trailing port information implies the default port for the service requested (e.g., “80” for an HTTP URL).

So if an HTTPS request is made to a non-standard port, say 29043, then you should send a port even though the RFC doesn’t compel you to. And if you make HTTP or HTTPS requests to standard ports, then it’s probably best to omit the port string.

The above is my interpretation. I’ve maintained an HTTP client for thirteen years and this has been a point of contention. In the course of all that time, I’ve added and dropped :port from the header. Like Jason in a hockey mask, it keeps coming back. In its latest iteration, siege implements the interpretation you see above. If the port is non-standard, it appends :port to the string. If it is standard, then it simply sends the host.

Look for this feature in siege-3.0.8

 

 

 

 



Siege 3.0.7 Release

Here’s the format for a location header,  Location: absolute_url

Unfortunately, many developers don’t care about standards and Internet Exploder is famous for letting them get away with it. When siege followed the letter of the law, I was inundated with bug reports that weren’t bugs at all. If siege is confused by Location: /haha that’s your developer’s problem, not mine. Against my better judgement and beginning with siege-3.0.6, I started constructing absolute_urls from relative paths. Unfortunately, my parser missed a usecase: localhost. Siege 3.0.6 will barf on this:

Location: http://localhost/haha_or_whatever

Technically, I didn’t miss localhost. If you look at url.c:459 you’ll see this:

// XXX: do I really need to test for localhost?

It didn’t occur to me that people would run siege on the same server as their webserver.  My bad. There are many tests besides load tests.

All siege users running version 3.0.6 should upgrade to siege-3.0.7.tar.gz



An Old Dog Learns A New Trick

Beginning with version 3.0.6-beta2, siege reacts differently to –reps=once.

In the past, when you invoked –reps=once, each siege user would invoke each URL in the file exactly one time. If urls.txt contained 100 files and you ran -c10 –reps=once, siege would finish its business with 1000 hits.

That was then.

This is now: siege runs each URL in the file exactly once. If you run -c10 –reps=once, then siege will split the file among all 10 users and hit each URL one time. Whereas in the past, you’d finish with 1000 hits, you now finish with 100 hits.

This should give you greater control by making tests more precise.



Siege 3.0.4 Becomes Part of the Problem

Siege 3.0.4 was just released. It contains a feature that I’ve added with a certain amount of reluctance. To understand the feature and the reason for my trepidation, let’s visit RFC 2616 and read what it has to say about Location headers:

For 3xx responses, the location SHOULD indicate the server's
preferred URI for automatic redirection to the resource. The 
field value consists of a single absolute URI.
    Location = "Location" ":" absoluteURI
An example is:
    Location: http://www.w3.org/pub/WWW/People.html

That’s pretty clear, right? The value of a location header must be an absolute URI. Yet a large number of developers ignore that directive. Here’s the response from a server running SquirrelMail, a popular web-based email program:

     HTTP/1.1 302 Found
     Date: Tue, 17 Sep 2013 16:50:52 GMT
     Server: CERN/1.0A
     X-Powered-By: PHP/5.2.5
     Location: src/login.php
     Content-Length: 0
     Connection: close
     Content-Type: text/html; charset=WINDOWS-1251

Although that Location header violates RFC 2616, nearly every web client will follow it to SquirrelMail’s intended destination. I say “nearly every client.” Until version 3.0.4, siege wouldn’t have followed it any where. It would have scratched its head and said, “Fsck it. Next URL.”

It is with some reluctance that I’ve included siege in the community of clients that allow developers to circumvent established standards. This convention has created a slew of bad coding practices on the world wide web. Didn’t close a table with an end tag? That’s okay, M$ will close it for you. Used a relative URI in a Location header? Don’t worry, siege will normalize it for you.

Ironically, version 3.0.4 includes one other feature enhancement. Its default User-agent is now in full compliance with RFC 2616. You win some, you lose some. And so it goes….



Siege 3.0.3 and URL Encoding

URL encodingURL encoding aka URL escaping aka percent encoding is a mechanism for converting URL characters into a format that can be transmitted by HTTP. Reserved characters are replaced by a hexadecimal value preceded with a ‘%’ which is an escape character. If a URL contains a space, for example, it must be encoded for transmission. Your browser takes a space and reformats it as %20.

Siege, on the other hand, does nothing. It expects you to encode your own damn URLs … that is, until now! Percent encoding is available in siege starting with version 3.0.3-beta2. When it emerges from beta, the first stable version to support this feature will be 3.0.3.

Really? Siege has been around since 1999 and you’re only now adding this feature?

Well, you guys never asked and I haven’t had much need for it. Lately, however, I’ve noticed many of you are asking about json. I suspect URL escaping will be helpful to those folks. Consider this:

siege -g ‘http://www.joedog.org/siege/echo.php?q={ “Hello” : “world” }’

 GET /siege/echo.php?q=%7B%20%22Hello%22%20:%20%22world%22%20%7D HTTP/1.0
 Host: www.joedog.org
 Accept: */*
 User-Agent: JoeDog/1.00 [en] (X11; I; Siege 3.0.3-beta2)
 Connection: close

Booya! Just make sure you single quote the URL like in the example above.

Since URL escaping is in its early stages, I’ve provided a mechanism for disabling it. Inside $HOME/.siegerc add the following: url-escaping = false The default value is true.

Anything else in version 3.0.3-beta2 that we should know about?

Why yes! We changed behavior for -g/–get. When you retrieve a page using -g/–get, siege sets its protocol to HTTP/1.0 so the page is human readable. We don’t need to read chunked encodings and neither do you.

 

H/T: Your JoeDog would like to give a shout out to the folks at wget from whom he completely stole most of the code necessary to implement URL escaping. Cheers.