Implementing your own LINQ & IEnumerable<T>
The project I'm working on has really huge collections (1M-1B elements), and things are mostly modified as whole collections.
This is a real-time application, so performance is paramount.
Some operations, such as Reverse, BinarySearch (is that even possible?), etc., will suffer more than others, such as Select.
Is it possible to implement one's own IEnumerable with MoveNext, MovePrev, etc., plus one's own LINQ extensions that take advantage of them?
If this happens, it will happen towards the end of the project, because we need to get it working first and then make it faster.
All in all, this should not be too much work, right?
It is definitely possible to create your own implementation of Enumerable that special-cases some situations. Basically, you would detect your own collection types (or perhaps just collections such as List<T>) and use a more efficient implementation where applicable.
I have a sample project that I used to demonstrate "LINQ to Objects in an hour", which you may like to look at for examples. It is not a complete implementation and, in particular, it is less efficient than the real LINQ to Objects, but you may still find it interesting.
Alternatively, you may find that i4o (Indexed LINQ) does everything you need out of the box, or that you would be better off building on it than starting from scratch. Worth checking out.
Just remember that at the end of the day, LINQ is basically a nice design combined with syntactic sugar. For example, the C# compiler knows nothing special about System.Linq.Enumerable.
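To illustrate that last point, here is a minimal sketch of a query expression binding to a hand-written Where method with no reference to System.Linq at all. The names Element, ElementCollection and QueryPatternDemo are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
// Deliberately no "using System.Linq;" here.

public sealed class Element
{
    public int Id { get; set; }
}

// Hypothetical collection type that supplies its own Where method.
public sealed class ElementCollection
{
    private readonly List<Element> items = new List<Element>();

    public void Add(Element element) => items.Add(element);

    // The query expression below binds to this method purely by name and shape.
    public IEnumerable<Element> Where(Func<Element, bool> predicate)
    {
        foreach (var item in items)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
}

public static class QueryPatternDemo
{
    public static List<int> MatchingIds(ElementCollection collection, int id)
    {
        // The compiler rewrites this as collection.Where(element => element.Id == id).
        var result = from element in collection
                     where element.Id == id
                     select element;

        var ids = new List<int>();
        foreach (var element in result)
        {
            ids.Add(element.Id);
        }
        return ids;
    }
}
```

Because the translation is purely syntactic, whatever Where method is found on the collection type (or as an extension method in scope) is the one that runs.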
If you really need performance, you can do quite a lot. Remember that the following query:
    var result = from element in collection
                 where element.Id == id
                 select element;
compiles to:
    var result = collection.Where(element => element.Id == id);
If you define the following method on the type of collection, you can take advantage of the fact that the predicate is an equality test on the Id member and handle the query in an optimized way. The important thing is to correctly identify the operations that are performance-critical for your application and to choose the right algorithms (i.e. complexity) to implement them.
    public IEnumerable<TElement> Where(Expression<Func<TElement, bool>> selector)
    {
        // detect equality of the Id member and return some special value
    }
Consider System.Linq.Enumerable.Reverse() - this method enumerates the entire IEnumerable before returning the first result.
If your query is myCollection.Reverse().Take(10) and your collection contains billions of elements, it is a terrible idea to enumerate billions of elements to get 10 of them.
If you provide a Reverse method on your own type, you can supply a better implementation that simply loops backward over the collection (possibly by index).
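As a rough sketch of that idea, assuming the data lives in something indexable: the hypothetical BigCollection<T> below wraps a List<T> and exposes its own lazy Reverse that walks backward by index. The type name and the choice of List<T> as backing store are assumptions made for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical list-backed collection with its own Reverse.
public sealed class BigCollection<T> : IEnumerable<T>
{
    private readonly List<T> items;

    public BigCollection(IEnumerable<T> source)
    {
        items = new List<T>(source);
    }

    public IEnumerator<T> GetEnumerator() => items.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // An instance method, so the compiler prefers it over the
    // Enumerable.Reverse extension method for this type.
    // Nothing is buffered: elements are produced one at a time, last to first.
    public IEnumerable<T> Reverse()
    {
        for (int i = items.Count - 1; i >= 0; i--)
        {
            yield return items[i];
        }
    }
}
```

With this in place, myCollection.Reverse().Take(10) should only touch the last ten elements, since the iterator stops as soon as Take has what it needs.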
The key to this is to provide your own type in which you control the implementation. You cannot use implementations that work for all IEnumerable<T>, because these implementations will not take full advantage of the capabilities of your custom collection type.
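Coming back to the Where(Expression<...>) idea from earlier, here is one way the detection could look on a type you control. This is only a sketch under several assumptions: Entry and IdIndexedCollection are hypothetical names, the index is a plain Dictionary keyed by Id, and only the shapes "x.Id == constant" and "x.Id == capturedVariable" are recognized. Anything else falls back to a linear scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public sealed class Entry
{
    public int Id { get; set; }
}

// Hypothetical collection that keeps an Id index next to its items.
public sealed class IdIndexedCollection
{
    private readonly List<Entry> items = new List<Entry>();
    private readonly Dictionary<int, List<Entry>> byId = new Dictionary<int, List<Entry>>();

    // Exposed only so the example can show which path was taken.
    public bool LastQueryUsedIndex { get; private set; }

    public void Add(Entry entry)
    {
        items.Add(entry);
        if (!byId.TryGetValue(entry.Id, out var bucket))
        {
            bucket = new List<Entry>();
            byId[entry.Id] = bucket;
        }
        bucket.Add(entry);
    }

    public IEnumerable<Entry> Where(Expression<Func<Entry, bool>> selector)
    {
        if (TryGetIdOperand(selector, out int id))
        {
            // Fast path: a dictionary lookup instead of a full scan.
            LastQueryUsedIndex = true;
            return byId.TryGetValue(id, out var bucket) ? bucket : new List<Entry>();
        }

        // Fallback: compile the expression and scan every item.
        LastQueryUsedIndex = false;
        Func<Entry, bool> predicate = selector.Compile();
        return items.FindAll(entry => predicate(entry));
    }

    // Recognizes "x => x.Id == <int constant or captured int variable>".
    private static bool TryGetIdOperand(Expression<Func<Entry, bool>> selector, out int id)
    {
        id = 0;
        if (selector.Body is BinaryExpression binary
            && binary.NodeType == ExpressionType.Equal
            && binary.Left is MemberExpression left
            && left.Expression == selector.Parameters[0]
            && left.Member.Name == nameof(Entry.Id)
            && binary.Right.Type == typeof(int))
        {
            if (binary.Right is ConstantExpression constant && constant.Value is int value)
            {
                id = value;
                return true;
            }
            // A captured local shows up as a member access on a closure constant.
            if (binary.Right is MemberExpression captured
                && captured.Expression is ConstantExpression)
            {
                id = Expression.Lambda<Func<int>>(captured).Compile()();
                return true;
            }
        }
        return false;
    }
}
```

Note that, unlike the standard Where, this sketch evaluates eagerly, and compiling expressions has a cost of its own, so whether it pays off depends on how large the collection is and how often such queries run.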
Is it possible to implement one's own IEnumerable with MoveNext, MovePrev, etc., plus one's own LINQ extensions that take advantage of them?
IEnumerable (or more correctly, IEnumerator) does not have MovePrev. You could define an interface:
    public interface IReversable<T> : IEnumerable<T>
    {
        IEnumerator<T> GetReverseEnumerator();
    }
This can be implemented by any container that supports efficient backward enumeration.
You could then write an overload of Reverse (an extension method) that works with this new interface, along with collection classes that implement the interface, and so on. You would then have to use those collection classes instead of standard ones like List<T>.
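A minimal sketch of what that might look like, using the IReversable<T> interface above. ReversableArray<T> and ReversableExtensions are hypothetical names for the example, and the array-backed container is just the simplest thing that supports backward enumeration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public interface IReversable<T> : IEnumerable<T>
{
    IEnumerator<T> GetReverseEnumerator();
}

// Hypothetical container that can be walked in either direction.
public sealed class ReversableArray<T> : IReversable<T>
{
    private readonly T[] items;

    public ReversableArray(params T[] items)
    {
        this.items = items;
    }

    public IEnumerator<T> GetEnumerator()
    {
        foreach (var item in items)
        {
            yield return item;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public IEnumerator<T> GetReverseEnumerator()
    {
        for (int i = items.Length - 1; i >= 0; i--)
        {
            yield return items[i];
        }
    }
}

public static class ReversableExtensions
{
    // Overload for IReversable<T>; it is a more specific match than
    // Enumerable.Reverse<T>(IEnumerable<T>) when the static type is IReversable<T>.
    public static IEnumerable<T> Reverse<T>(this IReversable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return ReverseIterator(source);
    }

    private static IEnumerable<T> ReverseIterator<T>(IReversable<T> source)
    {
        using (var enumerator = source.GetReverseEnumerator())
        {
            while (enumerator.MoveNext())
            {
                yield return enumerator.Current;
            }
        }
    }
}
```

One caveat with this design: overload resolution happens at compile time, so the overload is only picked when the variable's static type is IReversable<T> or a type implementing it. If the same object is held as a plain IEnumerable<T>, the standard Reverse is called instead.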
But (I don't have Reflector handy to check) it may be that the built-in Reverse is smart enough to do things the quick way if it can get the IList interface from the collection, which would optimize the most common cases anyway.
So there might not be much point in such an approach.