A river of T
09 Aug 2010


Generic collections, iterators and Linq all have a single interface in common: IEnumerable<T>. If you’ve been writing C# code lately, I bet you’ve written your fair share of methods taking or returning IEnumerables.

Boo (among other marvels) has a cute syntactic shorthand for it. You can write T* instead of IEnumerable[of T]. Of course people would really like to have that in C# as well. Sure, T* would be nice, but it’s already used for pointers. Some other suggestions in the Stack Overflow question involve using the # suffix, or {}. For my part, I quite like the idea of re-using ~. It’s only used by the bitwise complement operator in C# (and by finalizers, thanks Scott for the heads up), and if you look at it with a bit of poetry, it kind of looks like a river, a stream. And what’s an IEnumerable<T> but a stream of T?
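For reference, here is a minimal sketch of the two places where ~ already shows up in C# today. Resource and TildeDemo are made-up names, used only for illustration. Both existing uses put the ~ in front of something, whereas the shorthand discussed here would put it after a type name.

```csharp
using System;

class Resource {

	// Existing use 1: '~' in front of the class name declares a finalizer.
	~Resource ()
	{
		// Cleanup of unmanaged state would go here.
	}
}

static class TildeDemo {

	// Existing use 2: '~' in front of an expression is the bitwise complement.
	public static int Complement (int value)
	{
		return ~value;
	}

	static void Main ()
	{
		Console.WriteLine (Complement (5)); // prints -6
	}
}
```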

Now discussing wishes for the next version of C# is nice and all, but I felt like hacking a bit on mcs, Mono’s C# compiler, tonight, and after less than half an hour of grepping through the code I had a working version of a patch that enables this syntactic sugar.

Let’s take a very poorly implemented set of Linq operators:

using System;
using System.Collections.Generic;

static class Enumerable {

	public static IEnumerable<T> Concat<T> (this IEnumerable<T> source, IEnumerable<T> other)
	{
		foreach (var item in source)
			yield return item;

		foreach (var oitem in other)
			yield return oitem;
	}

	public static IEnumerable<TResult> Select<TSource, TResult> (this IEnumerable<TSource> source, Func<TSource, TResult> selector)
	{
		foreach (var item in source)
			yield return selector (item);
	}

	public static IEnumerable<TResult> SelectMany<TSource, TResult> (this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector)
	{
		foreach (var item in source)
			foreach (var sub_item in selector (item))
				yield return sub_item;
	}

	public static IEnumerable<T> Where<T> (this IEnumerable<T> source, Func<T, bool> predicate)
	{
		foreach (var item in source)
			if (predicate (item))
				yield return item;
	}
}

With my patch, you could write instead:

using System;

static class Enumerable {

	public static T~ Concat<T> (this T~ source, T~ other)
	{
		foreach (var item in source)
			yield return item;

		foreach (var oitem in other)
			yield return oitem;
	}

	public static TResult~ Select<TSource, TResult> (this TSource~ source, Func<TSource, TResult> selector)
	{
		foreach (var item in source)
			yield return selector (item);
	}

	public static TResult~ SelectMany<TSource, TResult> (this TSource~ source, Func<TSource, TResult~> selector)
	{
		foreach (var item in source)
			foreach (var sub_item in selector (item))
				yield return sub_item;
	}

	public static T~ Where<T> (this T~ source, Func<T, bool> predicate)
	{
		foreach (var item in source)
			if (predicate (item))
				yield return item;
	}
}

I’ll let readers decide for themselves which version is more readable. It's pretty much the same debate as the one over nullable types (T? versus Nullable<T>). Also note that this is just a toy patch, mainly for the sake of digging a bit in mcs, but it’s an interesting experiment nevertheless.
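If you want to play with the operators themselves, a small driver along these lines should do. It's only a sketch: it uses the plain IEnumerable<T> spelling so that it builds with an unpatched compiler, and the RiverDemo namespace, the Demo class and its Run helper are names invented for this example. Wrapping everything in a namespace is a precaution, so that these toy operators are the ones picked up even if System.Linq happens to be imported.

```csharp
using System;
using System.Collections.Generic;

namespace RiverDemo {

	static class Enumerable {

		public static IEnumerable<T> Concat<T> (this IEnumerable<T> source, IEnumerable<T> other)
		{
			foreach (var item in source)
				yield return item;

			foreach (var oitem in other)
				yield return oitem;
		}

		public static IEnumerable<TResult> Select<TSource, TResult> (this IEnumerable<TSource> source, Func<TSource, TResult> selector)
		{
			foreach (var item in source)
				yield return selector (item);
		}

		public static IEnumerable<TResult> SelectMany<TSource, TResult> (this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector)
		{
			foreach (var item in source)
				foreach (var sub_item in selector (item))
					yield return sub_item;
		}

		public static IEnumerable<T> Where<T> (this IEnumerable<T> source, Func<T, bool> predicate)
		{
			foreach (var item in source)
				if (predicate (item))
					yield return item;
		}
	}

	static class Demo {

		// Hypothetical helper: chains the four operators and joins the result.
		public static string Run ()
		{
			IEnumerable<int> evens = new [] { 1, 2, 3, 4 }.Where (n => n % 2 == 0); // 2, 4
			IEnumerable<int> all = evens.Concat (new [] { 5 });                     // 2, 4, 5
			IEnumerable<int> squares = all.Select (n => n * n);                     // 4, 16, 25
			IEnumerable<int> mirrored = squares.SelectMany (n => new [] { n, -n }); // each value, then its negation
			return string.Join (",", mirrored);
		}

		static void Main ()
		{
			Console.WriteLine (Run ()); // prints 4,-4,16,-16,25,-25
		}
	}
}
```

With the patched mcs, every IEnumerable<X> in the signatures above could presumably be spelled X~ without touching the method bodies or the calling code.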