LINQ + Regex = pure power

Combining regular expressions with LINQ gives a concise, powerful, high-level way to describe string parsing that would traditionally take very verbose iteration/condition/assignment code.

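Because the collection types in the System.Text.RegularExpressions namespace (MatchCollection, GroupCollection, CaptureCollection) only implement the non-generic IEnumerable in the .NET Framework, a Cast<T>() call has to be applied before the LINQ operators can be used.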
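As a minimal sketch of that cast in isolation (the CastDemo class and Numbers helper are made-up names for illustration only), the following pulls every run of digits out of a string. On newer runtimes, where MatchCollection also implements IEnumerable<Match>, the Cast call is redundant but harmless.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class CastDemo
{
    // Hypothetical helper: returns every run of digits found in the input.
    public static string[] Numbers(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"\d+");

        // Cast<Match>() turns the non-generic sequence into an
        // IEnumerable<Match>, which is what Select and friends need.
        return matches.Cast<Match>()
                      .Select(m => m.Value)
                      .ToArray();
    }
}
```

Calling CastDemo.Numbers("a1b22c333") yields "1", "22" and "333".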

Here's an example that returns the distinct, lowercased first words from a list of camel-cased strings that have multiple words:

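using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Extension methods must live in a static class; the class name is arbitrary.
static class StringExtensions
{
    // Anchored with ^ so only the first word of each string is captured.
    // The second group must start with an uppercase letter; otherwise
    // backtracking would split a single word such as "customer" in two.
    // Built once, since RegexOptions.Compiled makes construction expensive.
    static readonly Regex reg = new Regex(
        @"^(?'entity'[A-Za-z\d][\da-z_]+)([A-Z][\da-z_]+)",
        RegexOptions.Compiled |
        RegexOptions.CultureInvariant |
        RegexOptions.ExplicitCapture);

    public static IEnumerable<string> FirstWords(
        this IEnumerable<string> values)
    {
        return values.SelectMany(
            value => reg.Matches(value).Cast<Match>().Select(
                match => match.Groups["entity"].Value.ToLowerInvariant()
            )
        ).Distinct(StringComparer.Ordinal);
    }
}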

Sweet, eh? Try doing that with System.String...
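For comparison, a hand-written version using only System.String and char might look roughly like the sketch below. It assumes the first word simply ends at the first uppercase letter after the leading character, which only approximates what the regex version is meant to do, and ManualParsing and FirstWordsManual are made-up names.

```csharp
using System;
using System.Collections.Generic;

static class ManualParsing
{
    // Hypothetical loop-based counterpart to the LINQ + Regex version.
    public static IEnumerable<string> FirstWordsManual(
        IEnumerable<string> values)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (string value in values)
        {
            // Look for the first uppercase letter after the leading character.
            int split = -1;
            for (int i = 1; i < value.Length; i++)
            {
                if (char.IsUpper(value[i]))
                {
                    split = i;
                    break;
                }
            }

            // Skip single-word strings and first words shorter than two characters.
            if (split < 2)
            {
                continue;
            }

            string first = value.Substring(0, split).ToLowerInvariant();

            // HashSet.Add returns false for duplicates, so each word is yielded once.
            if (seen.Add(first))
            {
                yield return first;
            }
        }
    }
}
```

Given "customerOrder", "CustomerInvoice", "productItem" and "single", this yields "customer" and "product". That is the iteration/condition/assignment code that the LINQ query collapses into a single expression.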
