Category Archives: Collections & Generics

Collections in .NET

Collection : 

A collection is an object that simply allows you to group other objects. Collections can be strongly typed or weakly typed. Basically if your group of objects is strongly typed, it means that it can contain only objects of a specific type. An advantage when using strongly typed class is that when you access an item in the collection, the type of that item is known, and no additional casting needs to be done.  On the other hand, weakly typed collections object can contain objects of any type.

There are different collection namespaces and collection classes in .NET and it is easy for anyone to get confused by the host of collections available in the .NET Framework.

System.Collections:  This Original Collections: System.Collections namespace are largely considered deprecated by developers and by Microsoft itself. In fact they indicate that for the most part you should always favor the generic or concurrent collections. The System.Collections namespace contains interfaces and classes that define various collections of objects, such as lists, queues, bit arrays, hash tables and dictionaries.  In general, the these collections are non-type-safe and in some cases less performing than their generic counterparts.
System.Collections.Concurrent:  The System.Collections.Concurrent namespace provides several thread-safe collection classes that should be used in place of the corresponding types in the System.Collections and System.Collections.Generic namespaces whenever multiple threads are accessing the collection concurrently. 
System.Collections.Generic:  The System.Collections.Generic namespace contains interfaces and classes that define generic collections, which allow users to create strongly typed collections that provide better type safety and performance than non-generic strongly typed collections. 
System.Collections.ObjectModel:  The System.Collections.ObjectModel namespace contains classes that can be used as collections in the object model of a reusable library. Use these classes when properties or methods return collections. 
System.Collections.Specialized:  The System.Collections.Specialized namespace contains specialized and strongly-typed collections; for example, a linked list dictionary, a bit vector, and collections that contain only strings.

Some of the collection classes frequently used are:

Dictionary: Best for high performance lookups.
SortedDictionary: Compromise of Dictionary speed and ordering, uses binary search tree.
SortedList : Very similar to SortedDictionary, except tree is implemented in an array, so has faster lookup on preloaded data, but slower loads.
List : Best for smaller lists where direct access required and no ordering.
LinkedList : Best for lists where inserting/deleting in middle is common and no direct access required.
HashSet : Unique unordered collection, like a Dictionary except key and value are same object.
SortedSet : Unique ordered collection, like SortedDictionary except key and value are same object.
Stack : Essentially same as List<T> except only process as LIFO
Queue : Essentially same as List<T> except only process as FIFO

System.Collections namespace

Weakly typed collections: summary

ArrayList: An array of contiguous indexed elements whose size can be increased dynamically at run time as required. You can use this data structure to store any types of data like integers, strings, structures, objects
  => A dynamic, contiguous collection of objects.
  => Easy to work with, provides basic collection functionality and can easily be converted to an Array.
  => Favor the generic collection List<T> instead.
Hashtable: Represents a collection of key/value pair elements stored based on the hash code of the key.
  => Associative, unordered collection of key-value pairs of objects.
  => Provides high performance access to items in the key-value pair collection, by a specific key.
  => Favor the generic collection Dictionary<TKey,TValue> instead.
Queue : Represents the data structure of first-in-first-out (FIFO).
  => First-in-first-out (FIFO) collection of objects.
  => Favor the generic collection Queue<T> instead.
SortedList : Represents a list of key/value pair elements, sorted by keys and can be accessed by key and by index.
  => Associative, ordered collection of key-value pairs of objects.
  => Provides access to items in the collection by a specific key or an index.
  => Favor the generic collection SortedList<T> instead.
Stack : Represents the data structure of last-in-first-out (LIFO).
  => Last-in-first-out (LIFO) collection of objects.
  => Favor the generic collection Stack<T> instead.

ArrayList Class:

You would expect the ArrayList is pretty much the same as the Array class, but the major differences are that ArrayList is not strongly typed and the ArrayList’s size can be changed dynamically. ArrayList is weakly typed, so in fact you could add any object to the ArrayList even if the type does not match the type of the objects already in the ArrayList.

The big advantage of an ArrayList is that you don’t need to know in advance how many items you want it to contain. A nice method of the ArrayList is the ToArray function, which is often used to build arrays without knowing in advance how many items they will contain. When you call the ToArray function you need to use type of the items you want to be in the array, as a parameter.

Hashtable Class:

A Hashtable class is a weakly typed collection of key-value pairs. This means that for each object you add to the Hashtable, you need to provide a key that uniquely identifies the object. The Hashtable lets you quickly get an object out of the collection by using it’s key. So why is this collection called a Hashtable? Well, what happens when you add a key with a corresponding object is that of the key object, the hash value is calculated. The hash value is a numeric value that can uniquely identify the key object. So the key object is not stored in the collection, only it’s hash value. To get an item out of your Hashtable, again you use the key object to retrieve it:

You can use the Hashtable if you want to access items in your weakly typed collection, based on a key, not on an index.

Queue Class:

Queue class is available for you in the System.Collections namespace. With the Queue class in .NET you can create weakly typed collections that are ordered by the order they are added to the collection. The mechanism here is FIFO: first in first out.

The use of the Queue class is rather limited. Only in special cases for example when you need to accept messages and process them in the order they arrived, the Queue class can be used.

SortedList Class

The SortedList class is a combination of the Array and the Hashtable. You can access items in a SortedList by the index (like the Array), or by the key (like the Hashtable). Like the name insinuates, a SortedList is sorted. But be aware that it’s sorted based on the key object! The index sequence is based on the sort sequence. Also be aware that a SortedList is generally slower than the Hashtable, due to the sorting. The SortedList class is, like the Hashtable, weakly typed. You can retrieve objects form a SortedList by an index, or by the key. Interesting method of the SortedList class is the GetKey(index), this will retrieve the key object for the item at the specified index. The other way around is the IndexOfKey(key) method. This method gets you the index of the object that is coupled with the specified key.

SortedList are useful if you need a collection of which you can access the items, based on either the key or index. Additional, the items are sorted based on the key value. Be aware that these additional functionalities come at a price performance wise. If you only need a collection based on a key-value pair, and you don’t need to be able to access to items based on the index, you’d better use the Hashtable class.

Stack Class

The Stack class is a weakly typed collection. The last object added to the stack is returned as first. This mechanism here is called LIFO: last in first out. The method that is used to add items to the Stack is Push. Like the Queue class, items in a Stack can’t be accessed directly. To get the last added item of the stack, you have can call the Pop or Peek method. The difference between them is that the Peek method just gets the last object of the stack, the Pop method also get the last object and them removes that item from the stack.

You can use it when you have to implement a LIFO mechanism.

Strongly typed collections:

Array Class: Predefined size, used for passing data around.

One of the most basic collection classes is the Array class. In fact it’s not really a collection class, due to its limitations and its not even located in the System.Collections namespace, but in the System namespace. It is strongly typed. The most important thing about the Array is that it has a fixed size, so when you declare an array, you need to specify the number of items you want to contain. Arrays can have multiple dimensions, to create matrix-like structures.

So when can you use an Array? Pretty obviously, when you know in advance how many items your collection contains and you do not expect much extra functionality like sorting or searching. Arrays are often used when passing a number of objects of the same type to a method or function call as a parameter. By using an Array, you don’t have to worry about the fact that your custom build strongly typed collection needs to be known at the receiving side of the call.

CollectionBase Class:Abstract base for building your own collection classes.

The abstract CollectionBase class provides the base class for a custom strongly typed collection. Because the the class is strongly typed, you can access it’s items without casting them to the right type, so no more CType’s or implicit castings.

DictionaryBase Class: Abstract base for building your own collection classes using a key-value pair.

Well, the CollectionBase class is build upon an Arraylist that is stored internally. The DictionaryBase class uses internally a Hashtable to store items of the collection. But the goal of the DictionaryBase class resembles the goal of the Collectionbase class, they both serve as an abstract base for building your own custom strongly typed collection classes.

For example in a DictionaryBase implementation or Hashtable it’s easy to lookup specific items based on a key, while you only can access items in an ArrayList by using an index. So when choosing the way you want to group your objects, you first have to think about what you want to do with your objects.

 

System.Collections.Generic namespace:

The System.Collections.Generic namespace contains interfaces and classes that define generic collections, which allow users to create strongly typed collections that provide better type safety and performance than non-generic strongly typed collections.

A generic collection is strongly typed (type safe), meaning that you can only put one type of object into it.  This eliminates type mismatches at runtime. Another benefit of type safety is that performance is better with value type objects because they don’t incur overhead of being converted to and from type object. With generic collections, you have the best of all worlds because they are strongly typed, like arrays, and you have the additional functionality, like ArrayList and other non-generic collections, without the problems.

Generics were added to version 2.0 of the C# language and the common language runtime (CLR). Generics introduce the concept of type parameters, which make it possible to design classes and methods that defer the specification of one or more types until the class or method is declared and instantiated by client code. Generic classes and methods combine reusability, type safety and efficiency  in a way that their non-generic counterparts cannot. Generics are most frequently used with collections and the methods that operate on them. Version 2.0 of the .NET Framework class library provides a new namespace, System.Collections.Generic, which contains several new generic-based collection classes.

System.Collections.Generic classes:

•Comparer<T> :  Provides a base class for implementations of the IComparer<T> generic interface. 
•Dictionary<TKey, TValue> :  Represents a collection of keys and values. 
•Dictionary<TKey, TValue>.KeyCollection  Represents the collection of keys in a Dictionary<TKey, TValue>. This class cannot be inherited. 
•Dictionary<TKey, TValue>.ValueCollection  Represents the collection of values in a Dictionary<TKey, TValue>. This class cannot be inherited. 
•EqualityComparer<T> :  Provides a base class for implementations of the IEqualityComparer<T> : generic interface. 
•HashSet<T> :  Represents a set of values. 
•KeyedByTypeCollection<TItem> :  Provides a collection whose items are types that serve as keys. 
KeyNotFoundException  The exception that is thrown when the key specified for accessing an element in a collection does not match any key in the collection. 
•LinkedList<T> :  Represents a doubly linked list. 
•LinkedListNode<T> :  Represents a node in a LinkedList<T>. This class cannot be inherited. 
•List<T> :  Represents a strongly typed list of objects that can be accessed by index. Provides methods to search, sort, and manipulate lists. 
•Queue<T> :  Represents a first-in, first-out collection of objects. 
•SortedDictionary<TKey, TValue> :  Represents a collection of key/value pairs that are sorted on the key. 
•SortedDictionary<TKey, TValue>.KeyCollection  Represents the collection of keys in a SortedDictionary<TKey, TValue>. This class cannot be inherited. 
•SortedDictionary<TKey, TValue>.ValueCollection  Represents the collection of values in a SortedDictionary<TKey, TValue>. This class cannot be inherited 
•SortedList<TKey, TValue> :  Represents a collection of key/value pairs that are sorted by key based on the associated IComparer<T> implementation. 
•SortedSet<T> :  Represents a collection of objects that is maintained in sorted order. 
•Stack<T> :  Represents a variable size last-in-first-out (LIFO) collection of instances of the same arbitrary type. 
•SynchronizedCollection<T> :  Provides a thread-safe collection that contains objects of a type specified by the generic parameter as elements. 
•SynchronizedKeyedCollection<K, T> :  Provides a thread-safe collection that contains objects of a type specified by a generic parameter and that are grouped by keys. 
•SynchronizedReadOnlyCollection<T> :  Provides a thread-safe, read-only collection that contains objects of a type specified by the generic parameter as elements.

System.Collections.Generic Interface:

•ICollection<T>  Defines methods to manipulate generic collections. 
•IComparer<T>  Defines a method that a type implements to compare two objects.
•IDictionary<TKey, TValue>  Represents a generic collection of key/value pairs. 
•IEnumerable<T>  Exposes the enumerator, which supports a simple iteration over a collection of a specified type. 
•IEnumerator<T>  Supports a simple iteration over a generic collection. 
•IEqualityComparer<T>  Defines methods to support the comparison of objects for equality. 
•IList<T>  Represents a collection of objects that can be individually accessed by index.  It Provides behavior to add, remove, and index items in a list of objects. Also, this interface defines members to determine whether the implementing collection type is read-only and/or a fixed-size container
•ISet<T>  Provides the base interface for the abstraction of sets.

System.Collections.Concurrent namespace:

The Concurrent Collections: The concurrent collections are new as of .NET 4.0 and are included in the System.Collections.Concurrent namespace. These collections are optimized for use in situations where multi-threaded read and write access of a collection is desired.
The concurrent queue, stack, and dictionary work much as you’d expect. The bag and blocking collection are more unique.

The System.Collections.Concurrent namespace provides several thread-safe collection classes that should be used in place of the corresponding types in the System.Collections and System.Collections.Generic namespaces whenever multiple threads are accessing the collection concurrently.

Below is the summary of each.
•ConcurrentBag<T>  Represents a thread-safe, unordered collection of objects.  Optimized for situations where a thread may be bother reader and writer. 
•ConcurrentDictionary<TKey, TValue>  Represents a thread-safe collection of key-value pairs that can be accessed by multiple threads concurrently. Optimized for multiple readers (allows multiple readers under same lock).
•ConcurrentQueue<T>  Represents a thread-safe first in-first out (FIFO) collection.
•ConcurrentStack<T>  Represents a thread-safe last in-first out (LIFO) collection. 
•OrderablePartitioner<TSource>  Represents a particular manner of splitting an orderable data source into multiple partitions. 
•Partitioner  Provides common partitioning strategies for arrays, lists, and enumerables. 
•Partitioner<TSource>  Represents a particular manner of splitting a data source into multiple partitions.  
•BlockingCollection<T>  Provides blocking and bounding capabilities for thread-safe collections that implement IProducerConsumerCollection<T>. Wrapper collection that implement producers & consumers paradigm. Readers can block until items are available to read. Writers can block until space is available to write (if bounded).

 

System.Collections.Specialized namespace:

BitVector32: A simple structure that stores Boolean values and small integers in 32 bits of memory.
CollectionsUtil: Creates collections that ignore the case in strings.
HybridDictionary: Implements IDictionary by using a ListDictionary while the collection is small, and then switching to a Hashtable when the collection gets large.
ListDictionary: Implements IDictionary using a singly linked list. Recommended for collections that typically contain ten items or fewer.
NameValueCollection: Represents a sorted collection of associated String keys and String values that can be accessed either with the key or with the index.
StringCollection: Represents a collection of strings.
StringDictionary: Implements a hashtable with the key strongly typed to be a string rather than an object.
StringEnumerator: Supports a simple iteration over a StringCollection.