Sorting in C#

Sometimes we all yearn for the days of DOS sort.exe, or MVS JCL's DFSORT. Sorts were big in the 2-3GL world of COBOL and the like as indexed files were not a given, and sequential file processing was the rule. Batch processing typically ran 'jobs' in steps, and one of those steps was inevitably a sort. Downstream process logic had precise dependencies on input data sort integrity and big-iron OS's all came with their own SORT utilities. The beauty of the sort step was it's clarity: you knew exactly what you wanted, and it was pretty easy to get it.

Although sorts have dropped from their prime position in the pantheon of programming processes, they're still around. Uncommon, but when required, utterly unavoidable.

Almost all programming languages make provision for sorting, and C#'s no exception.

C# sorting comes in a few flavours. The simplest is the sort method that pops up implemented in all sort of useful .NET library set objects like Lists. Of course, this sort of solution is rarely practical given the restriction of having only one sort operand, and we more often end up having to delve into the world of interfaces and OO approaches involving object derivation from useful .NET library components.

I recently had need to do a slightly-complicated sort on multiple keys, handling strings, numbers and datetimes. Here's a quick guide for this kind of situation, with a couple of learnt-the-hard way snippets.

1) Create a custom sort function.
This function is tasked with loading your data into a List (or some other aggregating .NET object type), defining and using your new custom sort IComparer object on this list and lastly, outputting your newly sorted List object into the output file. Running the sort itself involves calling the .Sort method on your list object, passing in the custom comparer object as the parameter.

2) Create a custom IComparer object
The magic (if you can call it that!) happens here.
IComparer-like objects simply compare two bits of data and return a 1 if the first is 'bigger', -1 if it's 'smaller' and 0 if they're the same. It's up to you to implement the logic behind bigger and smaller and in the process you can analyse the records to sort to process date times, strings and numbers (or even other objects). The comparison logic resides in the Compare method you create in this object - it's compulsory.

Two things are important to remember to get this object working correctly:

a) ALL comparison logic paths must be tested. The compiler picks up the fact there are missing paths for you if you forget some, but I guess there's only so much it can do. To abuse a consulting world term, your testing must be 'MECE' : mutually exclusive, completely exhaustive.

b) You should explicitly test for NULL against all the bits of the records you want to compare.

You can see that if you want to compare multiple parts of an object/record, you're going to end up with lots of if's and else's...especially given you need to check NULLs as well. The code can get quite ugly!

The result of your hard work implementing custom sort functionality is quite impressive speed-wise. A few hundred thousand record file with 3 comparison operands runs in 5 or so seconds on my average-fast PC. This is your reward for obeying the .NET paradigm - a somewhat verbose but very well-specified and slightly abstract piece of code works efficiently!

Here's a slightly schematic chunk of code explaining the above.


public static void SortMyFile() {
// Read the whole file into a List
List myLines = new List();
using (StreamReader r = new StreamReader("the file.txt")) {
string theLine;
while ((theLine = r.ReadLine()) != null)
{
myLines.Add(line);
}

// Create the sort object and sort
MyVeryOwnComparer myVeryOwnComparer = new MyVeryOwnComparer();
// Call the List.Sort method with the sorter as parameter.
myLines.Sort(myVeryOwnComparer);

// write out your sorted list
// using StreamWriter or the like
.
.
}

// Your iComparer derived comparing object.
// Create the tests in the Compare method you must implement.
// Don't forget all paths to be tested and test for NULL explicitly.
public class MyVeryOwnComparer : IComparer
{
public int Compare(string x, string y)
{
// break up your x and y strings,
// convert elements as required and return
// 1's, -1's or 0's
.
.
.
}
}


Comments

Popular Posts