[Disclaimer : All benchmarks were done in debug mode on an i7 920 system with no compiler optimizations]

After browsing some cool C++ util functions today on StackOverflow, one of the cool util functions was for trimming strings, so I became intrigued and decided to compare them against my own, as they used some cool C++ std library tricks that I've never seen before. After doing benchmark tests, I was excited to find my own implementation appeared to blow his out of the water. After 1,000,000 tests, his averaged about 27 microseconds, while mine at first averaged less than 5 microseconds. After tweaking mine a bit, my trim_trailingWhitespace function averaged close to 4 microseconds after averaging 1,000,000 samples. Also to note, his function works directly on the original string, while mine returns a new copy; so it's even more impressive.

Here's another benchmark using 1,000,000 samples.
Using the following string: "          "
His: 37.88 microseconds
Mine: 4.03 microseconds

Benchmark 3 using the same # of samples as before.
Using the following string: "  The world ends with you!   "
His: 30.37 microseconds
Mine: 6.26 microseconds

The reason for this post, is because I was excited to find out my implementation is extremely fast, and I felt like sharing it with the world to see if it's possible to do even better! I'm still thinking of ways to improve my functions, but please let me know if you can best me in any way, I'm here to learn!


The only header you should need is the following, with the proper using statements too of course:

#include <iostream>

using std::basic_string;

This is the trim function used by my trim_trailingWhitespace

template<typename CharType>
static basic_string<CharType> trim(const basic_string<CharType>& str, size_t index, size_t len)
{
	size_t length = str.size();
	if(index >= length)
	{
		return basic_string<CharType>();
	}

	length -= index;
	length = length > len ? len : length;

	return str.substr(index, length);
}

And this is my function to trim the whitespace at the end of the string:

template<typename CharType>
static basic_string<CharType> trim_trailingWhitespace(const basic_string<CharType>& str)
{
	size_t length = str.size();
	if(length == 0)
	{
		return str;
	}

	register size_t index = length - 1;

	while(' ' == str[index])
	{
		if(index != 0)
		{
			--index;
		}
		else
		{
			return basic_string<CharType>();
		}
	}

	return trim(str, 0, ++index);
}

You'll notice I'm using some funky template stuff here, and basic_string; this will allow you to use whatever kind of string encoding you want, 1Byte ASCII characters (string), or 2Byte unicode characters (wstring), or whatever. And the beauty of it is, if you pass in the proper string type, like string or wstring, you don't need to define the template parameters; your compiler will automatically fill them in for you so you only just need to call the function name without worrying about the template details!

I haven't gotten a chance to rewrite my trim_leadingWhitespace function, so that will be coming soon; and yes, I will rename these once I decide upon a better name.


After doing some further research I stumbled across this implementation.  I realized my above comparison to the first algorithm wasn't entirely fair as the implementation wasn't even competitive, even though it worked in-place.  So, here's the source for an actually efficient implementation of rear trimming white space:

void rtrim(string& str, const locale& loc)
{
  string::size_type pos = str.size();
  while (pos > 0 && isspace(str[pos - 1], loc)) pos--;
  str.erase(pos);
}

By replacing the isspace with ' ' == you can make this function much faster, though obviously this works with our locale, and may be different across other languages. Another possible reason this isn't as fast as it could be is because of the str[pos - 1], he's doing an extra calculation per loop besides the pos--. His solution gets around the problem of using an unsigned int (size_t) that mine originally had, but lucky for me my rewrite made mine even more efficient because of this trick he had to do.

And here's the details on isspace. I think my check works fine for both string and wstring and the like; if not, please let me know.  [Edit] You could pass the locale space/seperator in as a parameter to the function if that's an issue.

As the above solution works in place, I decided to tweak my original function to work in place on a string, and am excited to say that it beats out this obviously improved implementation.  Here's my new in-place implementation:

template<typename CharType>
static void trim_trailingWhitespace(basic_string<CharType>& str)
{
	size_t length = str.size();
	if(length == 0)
	{
		return;
	}

	register size_t index = length - 1;

	while(' ' == str[index])
	{
		if(index != 0)
		{
			--index;
		}
		else
		{
			str.clear();
			return;
		}
	}
	str.erase(++index);
}

So, my implementation provides shortcuts for an empty string, and for the case the entire string contains spaces (where you see str.clear()). It also provides an interesting way to loop through the string compared to the other implementation, allowing mine to be a bit quicker, probably because I only have to use str[index] (besides the above mentioned shortcuts).

Benchmark 1:
String: "      "
His: 2.97 microseconds
His + my tweak: 2.68 microseconds
Mine:2.54 microseconds

Benchmark 2:
String: "  The world ends with you!   "
His: 3.93 microseconds
His + my tweak: 3.68 microseconds
Mine: 3.65 microseconds

As noted at the header, the above tests were done in debug mode with no optimizations.  If you turn on optimizations you'll notice the two algorithms at the bottom here are neck and neck, it doesn't really matter what you use.  Heck, if you look at the algorithm from StackOverflow that's not nearly as efficient in debug, suddenly becomes nearly comparable.  Though, it's still nice to have a code that runs fast in debug too; but it does raise the question whether it was worth the time to do this research; as they say, don't optimize until you have to.  It's been a nice educational endeavor so far, so I don't really care if it might be considered wasted time, as I've learned more than a thing or two.  Thoughts?

And here's the link to the StackOverflow page of useful C++ utilities: Most Useful C++ Utils.

Here I sit, among fellow Game Design and Development (GDD) students in the GDD Computer Lab, my last quarter at Rochester Institute of Technology. There are so many things I want to do, and only so much time to do it; there's so much research I would love to do, but I'd have to sacrifice too much of my personal life, besides I'm not being paid to do it yet anyway. So, I've come to further realize, I think I would love to be paid to do research in the Video Game programming, related to graphics, user tools, etc. etc. There are A LOT of things I could do with user tools alone, especially when it comes to graphics, and it's something I'll be doing a bit of during this quarter so I can get a platform-independent Gadget System (UI and Application system) working!

I'm actually a little sad I haven't posted to my blog more, as there's much I want to share with the world, and I would love it for people to see I'm bursting with ideas and helpful thoughts.  I have knowledge I want to share with the world, to help make it easier for people like me to get started and break into developing video games!  So, I'll have to start tackling this more.  I have a Matrix Tutorial/post I plan on releasing, it'll be about using them in games, not necessarily implementation; as from a programmers perspective, you don't need to know about the implementation to use it, as long as you understand it.  There is soo much math behind all that goes into video games, and it would take a long time to get a grasp and derive every single thing that goes behind it all.

Right now I'm working on some utility functions, setting up a INI file for loading, saving, and editing INI files.  From there, I plan on setting up a Settings class where you can setup automatic loading from the INI file; I was tossing around the idea of being able to monitor an INI file in realtime so if you make a change to it your settings will load them up, which in response will throw events and re-apply the settings to your current game.  This seemed like a powerful debugging feature, though obviously should be a configurable option.

I think I may make a post on tools ideas, or at least make a post on the gadget system I'll be working on.  One of the tools I thought would be neat would be to use a networked mouse or keyboard, which in reality is more useful for consoles, but I thought still could be a powerful feature.  I know I can't wait to add networking features to my engine, I want to be able to use FTP or something like it for streaming files, either for transferring levels mid-game for user content, distributed processing, or what have you.  I also thought it would be neat to pull in files from the web as well, though I'm not yet sure if it's entirely necessary yet; granted, I haven't thought about this in a while.  I pretty much plan on using gadgets for managing everything that I can in the engine, to make it as easy as possible to work with, so you can tweak things in realtime with the utmost control.

Archive