Trim Text to a Specified Number of Words

Sometimes, mostly on a website, you want to trim text to a certain number of characters, for instance when you are listing posts and want them to look uniform. However, you want the cut to fall at the end of a word, likely followed by an ellipsis or some other symbol or image. There are a couple of pieces to this, but let’s look at the code:

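// Requires: using System.Text.RegularExpressions;
public static string TrimWords(string text, int maxLength)
{
	// Short enough already: return the text unchanged.
	if (text.Length <= maxLength) { return text; }
	// Cut to the maximum length and drop surrounding whitespace.
	text = text.Substring(0, maxLength).Trim();
	// Index of the last space, i.e. the start of a possibly cut-off word.
	var i = text.LastIndexOf(' ');
	if (i != -1)
	{
		// Walk left from the space past any punctuation or whitespace.
		var j = i;
		while (j >= 0 && !Regex.IsMatch(text[j] + "", "[a-z0-9]", RegexOptions.IgnoreCase))
		{
			j--;
		}
		// Cut right after the last letter or digit found (index 0 included).
		if (j >= 0) { i = j + 1; }
		text = text.Substring(0, i);
	}
	return text + "...";
}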

The method returns the trimmed string, given the original one and a maximum length in characters. We first, of course, check whether the given string is already within the given length, and if it is, we simply return the original string. After that check, we cut the string down to the length we specified and trim any whitespace from its ends. The next line gets us the zero-based index of the last space in our trimmed string. It doesn’t take into account the case where the string, cut to the exact length, ends with a punctuation character, but that is easily handled by checking something like this:

// Requires: using System.Linq;
var endsWithPunctuation = text.Length > 0 && (new[] { ',', '.', '?', '!', ':', ';' }).Contains(text[text.Length - 1]);

Add characters as needed and trim text by one character on the right if it’s true.
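
As a minimal sketch of that step, the check and the one-character trim can be wrapped in a small helper. The class and method names here (PunctuationExample, StripTrailingPunctuation) are made up for illustration and are not part of the method above, and the punctuation list is only a starting point.

```csharp
using System;
using System.Linq;

public static class PunctuationExample
{
    // Hypothetical helper: removes a single trailing punctuation character, if any.
    public static string StripTrailingPunctuation(string text)
    {
        // Assumed punctuation set; extend it as needed.
        var punctuation = new[] { ',', '.', '?', '!', ':', ';' };
        var endsWithPunctuation = text.Length > 0 && punctuation.Contains(text[text.Length - 1]);
        // Trim by exactly one character on the right when the check is true.
        return endsWithPunctuation ? text.Substring(0, text.Length - 1) : text;
    }
}
```

Only one character is removed per call, so a string ending in several punctuation characters would need the helper applied repeatedly.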

In the initial code, we simply look up the last space and then search backwards for the first character that is a letter or a digit. We want to be safe, so we check that we actually do have a space (i != -1). If that’s the case, we assign the index to j and keep moving left until we find an alphanumeric character.
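
The backward search can also be written without a regular expression. The sketch below is one possible alternative, not the article's method: the helper name LastAlphanumericIndex is hypothetical, and char.IsLetterOrDigit accepts any Unicode letter or digit, which is broader than the [a-z0-9] pattern used above.

```csharp
using System;

public static class WalkBackExample
{
    // Hypothetical helper: index of the nearest letter or digit at or before
    // 'start', or -1 when none is found.
    public static int LastAlphanumericIndex(string text, int start)
    {
        var j = start;
        // Move left while the current character is not a letter or digit.
        while (j >= 0 && !char.IsLetterOrDigit(text[j]))
        {
            j--;
        }
        return j;
    }
}
```

For "Hello, world! Th" with a start index of 13 (the last space), the search skips the space and the exclamation mark and stops at the d in "world", index 11.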

At the end, we have either reached such a character, in which case we truncate our text right after it, or we have run past the beginning of the text with no match, in which case we cut at the last space.

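The only thing left is to append the ellipsis and return our truncated text to wherever it is needed. Note that the ellipsis is added after the cut, so the result can be up to three characters longer than maxLength.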
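
To see how the method behaves, here is a small sketch with a few sample inputs. It carries a condensed copy of TrimWords, with the boundary check written as j >= 0, so that it compiles on its own. The wrapper class TrimWordsDemo and the PrintExamples driver are names made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrimWordsDemo
{
    // Condensed copy of the article's method so the snippet is self-contained.
    public static string TrimWords(string text, int maxLength)
    {
        if (text.Length <= maxLength) { return text; }
        text = text.Substring(0, maxLength).Trim();
        var i = text.LastIndexOf(' ');
        if (i != -1)
        {
            var j = i;
            while (j >= 0 && !Regex.IsMatch(text[j] + "", "[a-z0-9]", RegexOptions.IgnoreCase))
            {
                j--;
            }
            if (j >= 0) { i = j + 1; }
            text = text.Substring(0, i);
        }
        return text + "...";
    }

    // Hypothetical driver; call it from your own Main method.
    public static void PrintExamples()
    {
        // Already short enough: returned unchanged, no ellipsis.
        Console.WriteLine(TrimWords("Short text", 20));
        // prints "Short text"

        // Cut lands inside "This": the partial word and the "!" are dropped.
        Console.WriteLine(TrimWords("Hello, world! This is a test.", 16));
        // prints "Hello, world..."

        // Cut lands right after "fox": the last word is still dropped.
        Console.WriteLine(TrimWords("The quick brown fox jumps over the lazy dog", 20));
        // prints "The quick brown..."

        // No space in the cut text: it is kept as-is and the ellipsis appended.
        Console.WriteLine(TrimWords("Supercalifragilistic", 5));
        // prints "Super..."
    }
}
```

The third call shows a quirk worth knowing about: the method always discards the text after the last space, even when the cut happened to fall exactly on a word boundary.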
