Siege 3.1.0 Release Candidate 2

Earlier today we told you about a hard limit in siege. Since it relied on select rather than poll, it was unable to open more than 1024 concurrent sockets on most — maybe all! — operating systems. Now Your JoeDog never needs that many sockets but some of you are over 1000. When you scheduled that many, your OS would abort siege. Don’t worry, the Supreme Court is cool with that.

This limitation was the result of a design decision from 1999. Fortunately, a 2001 design decision made it easy to incorporate both mechanisms in the same distribution. The configure script will check to see if you have poll and, really, in 2015 you should have it. If you do, it will use that mechanism to test your sockets.

Nothing changes for you dinosaurs. If you don’t have poll, then you can carry on with your precious select.

So Your JoeDog could use some beta testers. The changes have been submitted to the main branch at github.com and they’re available in this distribution: http://download.joedog.org/siege/beta/siege-3.1.0-rc2.tar.gz

Test it, Doggers, test its little brains out.

 



Select and Timeouts and Poll, Oh My!

Hanging ChadsBack in 1999-ish, Your JoeDog made a decision! Once he makes a decision, he doesn’t look back. “That’s over, Doggers. It’s time to move on.” And move on we did. Since 1999 we’ve been using select to check if our socket descriptor is ready for input. This is an important test because siege pounds your web server until it calls him “Daddy.” If the socket doesn’t get ready, then the request must time out or siege will hang.

We don’t want more hanging chads!

There were two mechanisms we could have used to accomplish this: poll and select. We chose select because it was more flexible and more readily available. At the time, we supported a great many platforms: Solaris, HP-UX, AIX, SCO, BSD, Linux, etc. The downside to select was its capacity. The fd_set couldn’t handle numbers higher than 1024. Well back in 1999, Yahoo! might have been doing 1024 concurrent connections but mere mortals didn’t need that capacity.

Recently we told you that siege crashes when large pools of simultaneous users are created. We finally got to the bottom of that problem and it dates back to that 1999 decision. Siege aborts when socket descriptors larger than 1024 are passed to FD_SET. We will fix this problem but there’s currently no ETA. It will require extensive testing and more of those difficult 1999-ish decisions.

For example, should we just stone-cold switch to poll or should we support both mechanisms? The latter will create a macro soup; does anyone have a stomach for that? Another option is create an array of fd_sets and place them on each thread. That would allow us to continue using select. (Your JoeDog kinda likes this option). And finally we could just say “Fsck it.” If you really have that many users, then you can afford Load Runner.

NOTE: To give you some perspective on traffic, this site is ranked 155,295 in the US. We almost never exceed 50 hits a second. Your JoeDog professionally webmasters a site ranked under 8000 and it almost never exceeds 100 hits a second. You must have a really large audience to generate 1024 hits a second. The vast majority of you should be fine with 255 threads or less.



Siege 3.1.0 RC 1

We received messages about core dumps that occur when siege is run with more than 700 threads. That’s a lot of threads, you guys! First things first — and this can’t be stressed enough — if you run siege with -c700, make sure your web server is configured with a pool of 700 threads. If you hit a pool of 256 threads with 700 users, all you’re gonna do is make a mess, mmmkay?

Now back to those core dumps. Here’s the thing: we’ve been unable to reproduce the problem but some diligent siegers have worked with us to get stack traces. As a result, we’ve added improved error handling and integrity checks in the problematic region. Your JoeDog would greatly appreciate if you could test version 3.1.0 release candidate 1. You can provide feedback here or by email, whichever you prefer.

http://download.joedog.org/siege/beta/siege-3.1.0-rc1.tar.gz



Millions of Cookies, Cookies For Me

cookiemonsterA couple weeks ago, a reader notified Your JoeDog of a problem with siege’s cookie handling. When the server sets a cookie that’s already stored, siege won’t update its expiration time. A JoeDog Fellow told him, “That’s a paddlin’.” The last cookie in, is the next cookie out.

The investigation took Your JoeDog into cookie.c, a file he hasn’t touched in years. “This code is awful,” he said. “Who wrote this shit? … Hmm, some guy named Jeff Fulmer, et al. If I ever meet that guy, I’ll tell him a thing or two.”

As an ethnic German with a tendency to over engineer — well, um — everything,  Your JoeDog decided to completely overhaul cookie handling. Why change two lines when you can rewrite everything, amirite? The file cookie.c now contains a COOKIE object with getters and setters, cookies.c is a cookie list object which houses all cookies in memory. Cookies will persist in $HOME/.cookies but the details are still being considered.

Here’s the thing. Siege isn’t a browser, it’s many browsers. We store cookies at the thread level. Each thread gets its own list of cookies. To distinguish them, we used pthread_self(), which is a long long int like 140018384393984 or something. Technically it’s not a long long int, it’s a pthread_t, but you get the point. The next time you fire up siege, its threads will have different IDs so we need a convenient way to associate stored cookies with new thread IDs.

Furthermore, there’s no guarantee that your next run will have the same number of simulated users. We need to consider how to handle that situation. If you have 50 stored cookies and you run 60 new users, should we stop assigning cookies at 50 or do we start repeating them starting with cookie 1 to user 51? Inquiring minds want to know. I want to know.

The new code is not yet in version control. If you want a copy, let us know. We can create a beta branch or send you a tar ball. Happy hacking.

NOTE: Your JoeDog refers to cookie and cookies as objects because they are. Yes, siege is written in C rather than C++ but you can write objects in C just fine thankyouverymuch. An object is simply a thing which references itself. To achieve this in C, you just pass a reference to the object as an argument.



Hey! There’s No configure Script!!1!1!!!1!

Your JoeDog gets this a lot lately: “The INSTALL file says run ./configure but there’s no configure file.” It took a second but it’s now clear what’s happening. You guys are grabbing the code directly from github.com

When Your JoeDog redesigned this site, he put that snazzy “Fork me on Github” banner up in the corner. Exciting! So now you guys are forking me! Here’s the thing. The stuff on github is THE source code, it’s not a source distribution. You can find the source distribution on the downloads site.

So what’s the difference? The source distribution contains helper scripts which are generated by autotools. You know, like that configure script. Since configure is built from other files, it’s not a source file and we don’t maintain it in version control.

That doesn’t mean you can’t fork me on github! If you take that route, you’ll need to build your own configure script. We’ll learn how to do that after the jump….

Continue reading Hey! There’s No configure Script!!1!1!!!1!



Siege 3.0.9

What’s Your JoeDog doing now? He’s knee-deep in old C code. This code generates software that calculates the optimum way to cut sheets of linoleum as they roll off a production line. Aren’t you glad you asked? How old is this code? It was last updated in 1999 when it was ported to HP-UX.

You know how an old song can take you back — sometimes to a good place, sometimes to hell? Old code works like that. This project was coded by other humans, but Your JoeDog sees his own flaws in it. He sees techniques that remind him to hang himself back in 1999.

Nobody codes like that anymore. There’s a reason why we’ve abandoned some techniques in favor of others. For the past two weeks, Your JoeDog has been dereferencing variables, debugging memory leaks and trying to figure out what’s whacking his stack. Context is everything, people. In this one, you don’t want anything whacking your stack.

Now siege already encapsulates much of his current programming philosophy. It’s written in C but it relies on object-oriented architecture. If you encapsulate memory management it makes it easier to pinpoint flaws.

Unfortunately, his personal projects haven’t kept up with industry standards. This coding experience has prompted him to fix his sins before they become unmanageable. Your JoeDog updated to gcc-4.7.4 and he watched the warnings fly! This version fixes all of those warnings. There’s nothing sexy about it but you should probably upgrade anyway.

[JoeDog: Siege-3.0.9]

 

 



CTR Is Hard

Sproxy is a word Your JoeDog invented to describe his [S]iege [Proxy]. At the time of this writing, this site has the top three positions for ‘sproxy’ on Google. In the past week, nine hundred people typed ‘sproxy’ into the Google machine. Of those nine hundred, only 110 clicked a link to this site. That’s a 12.22% click-through rate for a made-up word that describes an esoteric piece of software that exists right on this very site. Let’s just say that falls a little below expectation….

 

 

 



Is A Port Number Required in the HTTP Host Header?

Well? Is it?

How’s this for a definitive answer: “Yes and no.”

We find the answer in RFC 2616 section 14.23:

The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI: “Host” “:” host [ “:” port ]

A “host” without any trailing port information implies the default port for the service requested (e.g., “80” for an HTTP URL).

So if an HTTPS request is made to a non-standard port, say 29043, then you should send a port even though the RFC doesn’t compel you to. And if you make HTTP or HTTPS requests to standard ports, then it’s probably best to omit the port string.

The above is my interpretation. I’ve maintained an HTTP client for thirteen years and this has been a point of contention. In the course of all that time, I’ve added and dropped :port from the header. Like Jason in a hockey mask, it keeps coming back. In its latest iteration, siege implements the interpretation you see above. If the port is non-standard, it appends :port to the string. If it is standard, then it simply sends the host.

Look for this feature in siege-3.0.8

 

 

 

 



Siege 3.0.7 Release

Here’s the format for a location header,  Location: absolute_url

Unfortunately, many developers don’t care about standards and Internet Exploder is famous for letting them get away with it. When siege followed the letter of the law, I was inundated with bug reports that weren’t bugs at all. If siege is confused by Location: /haha that’s your developer’s problem, not mine. Against my better judgement and beginning with siege-3.0.6, I started constructing absolute_urls from relative paths. Unfortunately, my parser missed a usecase: localhost. Siege 3.0.6 will barf on this:

Location: http://localhost/haha_or_whatever

Technically, I didn’t miss localhost. If you look at url.c:459 you’ll see this:

// XXX: do I really need to test for localhost?

It didn’t occur to me that people would run siege on the same server as their webserver.  My bad. There are many tests besides load tests.

All siege users running version 3.0.6 should upgrade to siege-3.0.7.tar.gz



An Old Dog Learns A New Trick

Beginning with version 3.0.6-beta2, siege reacts differently to –reps=once.

In the past, when you invoked –reps=once, each siege user would invoke each URL in the file exactly one time. If urls.txt contained 100 files and you ran -c10 –reps=once, siege would finish its business with 1000 hits.

That was then.

This is now: siege runs each URL in the file exactly once. If you run -c10 –reps=once, then siege will split the file among all 10 users and hit each URL one time. Whereas in the past, you’d finish with 1000 hits, you now finish with 100 hits.

This should give you greater control by making tests more precise.