Limitations of generics in C# and other languages
3.6.1
Lack of covariance and contravariance
In section 2.3.2, we looked at the covariance of arrays—the fact that an array of a reference type can be viewed as an array of its base type, or an array of any of the interfaces
it implements. Generics don’t support this—they are invariant. This is for the sake of
type safety, as we’ll see, but it can be annoying.
WHY DON’T GENERICS SUPPORT COVARIANCE?
Let’s suppose we have two classes, Animal and Cat, where Cat derives from Animal. In
the code that follows, the array code (on the left) is valid C# 2; the generic code (on
the right) isn’t:
Valid (at compile-time):          Invalid:
Animal[] animals = new Cat[5];    List<Animal> animals = new List<Cat>();
animals[0] = new Animal();        animals.Add(new Animal());
The compiler has no problem with the second line in either case, but the first line on
the right causes the error:
error CS0029: Cannot implicitly convert type
'System.Collections.Generic.List<Cat>' to
'System.Collections.Generic.List<Animal>'
This was a deliberate choice on the part of the framework and language designers. The
obvious question to ask is why this is prohibited—and the answer lies on the second
line. There is nothing about the second line that should raise any suspicion. After all,
List<Animal> has an Add method that accepts any Animal, so
you should be able to put a Turtle into any list of animals, for instance. However, the
actual object referred to by animals is a Cat[] (in the code on the left) or a List<Cat>
(on the right), both of which require that only references to instances of Cat are stored
in them. Although the array version will compile, it will fail at execution time. This was
deemed by the designers of generics to be worse than failing at compile time, which is
reasonable—the whole point of static typing is to find out about errors before the code
ever gets run.
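To see the execution-time failure for yourself, here's a minimal sketch. The ArrayCovarianceDemo class and its method name are invented for illustration; Animal and Cat are the two classes from the text, left empty here.

```csharp
using System;

public class Animal { }
public class Cat : Animal { }

// Hypothetical demo class, not part of any library.
public static class ArrayCovarianceDemo
{
    // Returns true if storing a plain Animal in a Cat[] (viewed as an
    // Animal[]) fails at execution time.
    public static bool StoreFailsAtExecutionTime()
    {
        Animal[] animals = new Cat[5];    // compiles: arrays are covariant
        try
        {
            animals[0] = new Animal();    // compiles, but the CLR checks the real element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreFailsAtExecutionTime());

        // The generic equivalent never gets this far. Under the C# 2 rules
        // described here, the compiler rejects it with error CS0029:
        // List<Animal> list = new List<Cat>();
    }
}
```

The array assignment is accepted by the compiler, and the mistake only surfaces as an ArrayTypeMismatchException when the store is attempted.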
NOTE
So why are arrays covariant? Having answered the question about why
generics are invariant, the next obvious step is to question why arrays are
covariant. According to the Common Language Infrastructure Annotated
Standard (Addison-Wesley Professional, 2003), for the first edition the
designers wished to reach as broad an audience as possible, which included
being able to run code compiled from Java source. In other words, .NET has
covariant arrays because Java has covariant arrays—despite this being a
known “wart” in Java.
So, that’s why things are the way they are—but why should you care, and how can you
get around the restriction?
CHAPTER 3
Parameterized typing with generics
WHERE COVARIANCE WOULD BE USEFUL
Suppose you are implementing a platform-agnostic storage system,11 which could run
across WebDAV, NFS, Samba, NTFS, ReiserFS, files in a database, you name it. You may
have the idea of storage locations, which may contain sublocations (think of directories
containing files and more directories, for instance). You could have an interface like this:
public interface IStorageLocation
{
    Stream OpenForRead();
    ...
    IEnumerable<IStorageLocation> GetSublocations();
}
That all seems reasonable and easy to implement. The problem comes when your
implementation (FabulousStorageLocation, for instance) stores its list of sublocations for any particular location as List<FabulousStorageLocation>. You might
expect to be able to either return the list reference directly, or possibly call AsReadOnly to avoid clients tampering with your list, and return the result. But that would
be an implementation of IEnumerable<FabulousStorageLocation> instead of an
IEnumerable<IStorageLocation>.
Here are some options:
■ Make your list a List<IStorageLocation> instead. This means you need
  to cast every time you fetch an entry in order to get at your implementation-specific
  behavior. You might as well not be using generics in the first place.
■ Implement GetSublocations using the funky new iteration features of C# 2, as
  described in chapter 6. That happens to work in this example, because the
  interface uses IEnumerable<IStorageLocation>. It wouldn't work if we had to
  return an IList<IStorageLocation> instead.
■ Create a new copy of the list, this time as List<IStorageLocation>. In some
  cases (particularly if the interface did require you to return an
  IList<IStorageLocation>), this would be a good thing to do anyway, as it keeps the
  list returned separate from the internal list. You could even use List.ConvertAll
  to do it in a single line. It involves copying everything in the list, though,
  which may be an unnecessary expense if you trust your callers to use the
  returned list reference appropriately.
■ Make the interface generic, with the type parameter representing the actual type
  of storage sublocation being represented. For instance, FabulousStorageLocation
  might implement IStorageLocation<FabulousStorageLocation>.
  It looks a little odd, but this recursive-looking use of generics can be quite useful
  at times.12
■ Create a generic helper method (preferably in a common class library) that
  converts IEnumerator<TSource> to IEnumerator<TDest>, where TSource
  derives from TDest.

11 Yes, another one.
12 For instance, you might have a type parameter T with a constraint that any instance can be compared to another
   instance of T for equality; in other words, something like MyClass<T> where T : IEquatable<T>.
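Three of these options can be sketched in a few lines. What follows is an illustration under some assumptions rather than a definitive implementation: the interface is trimmed down (OpenForRead is omitted), and the Name property, AddSublocation, CopySublocations, VarianceHelpers.Upcast, and StorageDemo are hypothetical names introduced only for the example. The helper also works on IEnumerable<T> rather than IEnumerator<T>, simply because that's easier to consume with foreach.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down version of the interface from the text.
public interface IStorageLocation
{
    string Name { get; }
    IEnumerable<IStorageLocation> GetSublocations();
}

public class FabulousStorageLocation : IStorageLocation
{
    private readonly string name;
    private readonly List<FabulousStorageLocation> sublocations =
        new List<FabulousStorageLocation>();

    public FabulousStorageLocation(string name) { this.name = name; }

    public string Name { get { return name; } }

    public void AddSublocation(FabulousStorageLocation child)
    {
        sublocations.Add(child);
    }

    // Iterator block option: each element is implicitly converted to
    // IStorageLocation as it is yielded, so no copy of the list is made.
    public IEnumerable<IStorageLocation> GetSublocations()
    {
        foreach (FabulousStorageLocation location in sublocations)
        {
            yield return location;
        }
    }

    // Copying option: build a separate List<IStorageLocation> with ConvertAll.
    public List<IStorageLocation> CopySublocations()
    {
        return sublocations.ConvertAll<IStorageLocation>(
            delegate(FabulousStorageLocation location) { return location; });
    }
}

public static class VarianceHelpers
{
    // Helper method option: view a sequence of TSource as a sequence of TDest.
    public static IEnumerable<TDest> Upcast<TSource, TDest>(IEnumerable<TSource> source)
        where TSource : TDest
    {
        foreach (TSource item in source)
        {
            yield return item;
        }
    }
}

public static class StorageDemo
{
    public static void Main()
    {
        FabulousStorageLocation root = new FabulousStorageLocation("root");
        root.AddSublocation(new FabulousStorageLocation("docs"));
        root.AddSublocation(new FabulousStorageLocation("music"));

        foreach (IStorageLocation location in root.GetSublocations())
        {
            Console.WriteLine(location.Name);
        }

        List<FabulousStorageLocation> plain = new List<FabulousStorageLocation>();
        plain.Add(root);
        foreach (IStorageLocation location in
            VarianceHelpers.Upcast<FabulousStorageLocation, IStorageLocation>(plain))
        {
            Console.WriteLine(location.Name);
        }
    }
}
```

The iterator block and the helper method both avoid copying, at the cost of only working for sequences. The ConvertAll version copies every reference but hands back a real list.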
When you run into covariance issues, you may need to consider all of these options
and anything else you can think of. It depends heavily on the exact nature of the situation. Unfortunately, covariance isn’t the only problem we have to consider. There’s
also the matter of contravariance, which is like covariance in reverse.
WHERE CONTRAVARIANCE WOULD BE USEFUL
Contravariance feels slightly less intuitive than covariance, but it does make sense.
Where covariance is about declaring that we will return a more specific object from a
method than the interface requires us to, contravariance is about being willing to
accept a more general parameter.
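For instance, suppose we had an IShape interface13 that contained the Area property. It's easy to write an implementation of IComparer<IShape> that compares any two shapes by area.
We'd then like to be able to write the following code:
IComparer<IShape> areaComparer = new AreaComparer();
List<Circle> circles = new List<Circle>();
circles.Add(new Circle(20));
circles.Add(new Circle(10));
circles.Sort(areaComparer);
That won't work, though, because the Sort method on List<Circle> effectively takes
an IComparer<Circle>. The fact that our AreaComparer can compare any shape
rather than just circles doesn't impress the compiler at all. It considers IComparer<Circle>
and IComparer<IShape> to be completely different types. Maddening, isn't
it? It would be nice if the Sort method had this signature instead:
void Sort<S>(IComparer<S> comparer) where T : S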
Unfortunately, not only is that not the signature of Sort, but it can’t be—the constraint is invalid, because it’s a constraint on T instead of S. We want a derivation type
constraint but in the other direction, constraining the S to be somewhere up the
inheritance tree of T instead of down.
Given that this isn’t possible, what can we do? There are fewer options this time
than before. First, you could create a generic class with the following declaration:
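ComparisonHelper<TBase, TDerived> : IComparer<TDerived>
where TDerived : TBase
You'd then create a constructor that takes (and stores) an IComparer<TBase> as a
parameter. The implementation of IComparer<TDerived> would just return the result
of calling the Compare method of the IComparer<TBase>. You could then sort the
List<Circle> by creating a new ComparisonHelper<IShape, Circle> that uses the
area comparison.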
The second option is to make the area comparison class generic, with a derivation
constraint, so it can compare any two values of the same type, as long as that type
implements IShape. Of course, you can only do this when you’re able to change the
comparison class—but it’s a nice solution when it’s available.
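Both options are short enough to sketch in full. The text only names IShape, Area, Circle, and ComparisonHelper, so treat the rest as assumptions made for illustration: the Radius property, the body of AreaComparer, the GenericAreaComparer<T> name, and the ShapeSortDemo class. Null handling is left out to keep the sketch small.

```csharp
using System;
using System.Collections.Generic;

public interface IShape
{
    double Area { get; }
}

public class Circle : IShape
{
    private readonly double radius;
    public Circle(double radius) { this.radius = radius; }
    public double Radius { get { return radius; } }
    public double Area { get { return Math.PI * radius * radius; } }
}

// Compares any two shapes by area.
public class AreaComparer : IComparer<IShape>
{
    public int Compare(IShape x, IShape y)
    {
        return x.Area.CompareTo(y.Area);
    }
}

// First option: wrap an IComparer<TBase> so that it can be used wherever
// an IComparer<TDerived> is required.
public class ComparisonHelper<TBase, TDerived> : IComparer<TDerived>
    where TDerived : TBase
{
    private readonly IComparer<TBase> comparer;

    public ComparisonHelper(IComparer<TBase> comparer)
    {
        this.comparer = comparer;
    }

    public int Compare(TDerived x, TDerived y)
    {
        // TDerived converts implicitly to TBase thanks to the constraint.
        return comparer.Compare(x, y);
    }
}

// Second option: make the comparison class itself generic.
public class GenericAreaComparer<T> : IComparer<T> where T : IShape
{
    public int Compare(T x, T y)
    {
        return x.Area.CompareTo(y.Area);
    }
}

public static class ShapeSortDemo
{
    public static void Main()
    {
        List<Circle> circles = new List<Circle>();
        circles.Add(new Circle(20));
        circles.Add(new Circle(10));

        // Option 1: adapt the existing IComparer<IShape>.
        circles.Sort(new ComparisonHelper<IShape, Circle>(new AreaComparer()));
        Console.WriteLine(circles[0].Radius);

        // Option 2: use the generic comparer directly.
        circles.Reverse();
        circles.Sort(new GenericAreaComparer<Circle>());
        Console.WriteLine(circles[0].Radius);
    }
}
```

In both cases the smaller circle ends up first. The helper class is the one to reach for when you can't change the existing comparer.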
13 You didn't really expect to get through the whole book without seeing a shape-related example, did you?
Notice that the various options for both covariance and contravariance use more
generics and constraints to express the interface in a more general manner, or to provide generic “helper” methods. I know that adding a constraint makes it sound less
general, but the generality is added by first making the type or method generic. When
you run into a problem like this, adding a level of genericity somewhere with an
appropriate constraint should be the first option to consider. Generic methods (rather
than generic types) are often helpful here, as type inference can make the lack of variance invisible to the naked eye. This is particularly true in C# 3, which has stronger
type inference capabilities than C# 2.
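As a rough illustration of that last point, here's a generic method with a constraint. The names Square, ShapeUtilities, and TotalArea are hypothetical. The idea is that the method is generic in T, so a List<Square> never needs converting to IEnumerable<IShape>, and the caller doesn't have to write the type argument.

```csharp
using System;
using System.Collections.Generic;

public interface IShape
{
    double Area { get; }
}

public class Square : IShape
{
    private readonly double side;
    public Square(double side) { this.side = side; }
    public double Area { get { return side * side; } }
}

// Hypothetical utility class for the example.
public static class ShapeUtilities
{
    // Accepts a sequence of any type implementing IShape.
    public static double TotalArea<T>(IEnumerable<T> shapes) where T : IShape
    {
        double total = 0;
        foreach (T shape in shapes)
        {
            total += shape.Area;
        }
        return total;
    }

    public static void Main()
    {
        List<Square> squares = new List<Square>();
        squares.Add(new Square(2));
        squares.Add(new Square(3));

        // T is inferred as Square, so the call reads as if variance existed.
        Console.WriteLine(TotalArea(squares));    // prints 13
    }
}
```

Had TotalArea been declared to take an IEnumerable<IShape> instead, the call with a List<Square> would hit exactly the invariance problem described in this section.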
NOTE
Is this really the best we can do?—As we’ll see later, Java supports covariance
and contravariance within its generics—so why can’t C#? Well, a lot of it
boils down to the implementation—the fact that the Java runtime
doesn’t get involved with generics; it’s basically a compile-time feature.
However, the CLR does support limited generic covariance and contravariance, just on interfaces and delegates. C# doesn’t expose this feature
(neither does VB.NET), and none of the framework libraries use it. The
C# compiler consumes covariant and contravariant interfaces as if they
were invariant. Adding variance is under consideration for C# 4,
although no firm commitments have been made. Eric Lippert has written
a whole series of blog posts about the general problem, and what might
happen in future versions of C#:
http://blogs.msdn.com/ericlippert/archive/tags/Covariance+and+Contravariance/default.aspx.
This limitation is a very common cause of questions on C# discussion groups. The
remaining issues are either relatively academic or affect only a moderate subset of the
development community. The next one mostly affects those who do a lot of calculations (usually scientific or financial) in their work.
3.6.2
Lack of operator constraints or a “numeric” constraint
C# is not without its downside when it comes to heavily mathematical code. The need
to explicitly use the Math class for every operation beyond the simplest arithmetic and
the lack of C-style typedefs to allow the data representation used throughout a program to be easily changed have always been raised by the scientific community as barriers to C#’s adoption. Generics weren’t likely to fully solve either of those issues, but
there’s a common problem that stops generics from helping as much as they could
have. Consider this (illegal) generic method:
public T FindMean<T>(IEnumerable<T> data)
{
    T sum = default(T);
    int count = 0;
    foreach (T datum in data)
    {
        sum += datum;       // illegal: no + operator is known for T
        count++;
    }
    return sum/count;       // illegal: no / operator is known for T
}
Obviously that could never work for all types of data—what could it mean to add one
Exception to another, for instance? Clearly a constraint of some kind is called for…
something that is able to express what we need to be able to do: add two instances of T
together, and divide a T by an integer. If that were available, even if it were limited to
built-in types, we could write generic algorithms that wouldn’t care whether they were
working on an int, a long, a double, a decimal, and so forth. Limiting it to the built-in types would have been disappointing but better than nothing. The ideal solution
would have to also allow user-defined types to act in a numeric capacity, so you could
define a Complex type to handle complex numbers, for instance. That complex number could then store each of its components in a generic way as well, so you could
have a Complex<float>, Complex<double>, and so on.14
Two related solutions present themselves. One would be simply to allow constraints on operators, so you could write a set of constraints such as
where T : T operator+ (T,T), T operator/ (T, int)
This would require that T have the operations we need in the earlier code. The other
solution would be to define a few operators and perhaps conversions that must be supported in order for a type to meet the extra constraint—we could make it the
“numeric constraint” written where T : numeric.
One problem with both of these options is that they can’t be expressed as normal
interfaces, because operator overloading is performed with static members, which
can’t implement interfaces. It would require a certain amount of shoehorning, in
other words.
Various smart people (including Eric Gunnerson and Anders Hejlsberg, who
ought to be able to think of C# tricks if anyone can) have thought about this, and with
a bit of extra code, some solutions have been found. They’re slightly clumsy, but they
work. Unfortunately, due to current JIT optimization limitations, you have to pick
between pleasant syntax (x=y+z) that reads nicely but performs poorly, and a method-based syntax (x=y.Add(z)) that performs without significant overhead but looks like a
dog's dinner when you've got anything even moderately complicated going on.
The details are beyond the scope of this book, but are very clearly presented at
http://www.lambda-computing.com/publications/articles/generics2/ in an article on
the matter.
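To give a flavor of the method-based style, here's one possible shape for it. This is a sketch under my own assumptions and not necessarily the design used in that article: ICalculator<T>, DoubleCalculator, DecimalCalculator, and the Statistics class are all hypothetical names. The arithmetic is supplied by a second type argument, so FindMean itself never needs an operator on T.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface supplying the operations FindMean needs for T.
public interface ICalculator<T>
{
    T Zero { get; }
    T Add(T left, T right);
    T DivideByCount(T value, int count);
}

public struct DoubleCalculator : ICalculator<double>
{
    public double Zero { get { return 0.0; } }
    public double Add(double left, double right) { return left + right; }
    public double DivideByCount(double value, int count) { return value / count; }
}

public struct DecimalCalculator : ICalculator<decimal>
{
    public decimal Zero { get { return 0m; } }
    public decimal Add(decimal left, decimal right) { return left + right; }
    public decimal DivideByCount(decimal value, int count) { return value / count; }
}

public static class Statistics
{
    // TCalc is a struct, so default(TCalc) gives a usable calculator
    // without allocating anything.
    public static T FindMean<T, TCalc>(IEnumerable<T> data)
        where TCalc : struct, ICalculator<T>
    {
        TCalc calculator = default(TCalc);
        T sum = calculator.Zero;
        int count = 0;
        foreach (T datum in data)
        {
            sum = calculator.Add(sum, datum);    // method call instead of +=
            count++;
        }
        if (count == 0)
        {
            throw new InvalidOperationException("Sequence contains no elements.");
        }
        return calculator.DivideByCount(sum, count);
    }

    public static void Main()
    {
        double[] values = { 1.0, 2.0, 6.0 };
        Console.WriteLine(FindMean<double, DoubleCalculator>(values));    // prints 3
    }
}
```

The call site has to name the calculator type explicitly, which is part of the clumsiness mentioned above. A user-defined numeric type would join in by providing its own calculator struct.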
The two limitations we’ve looked at so far have been quite practical—they’ve been
issues you may well run into during actual development. However, if you’re generally
curious like I am, you may also be asking yourself about other limitations that don’t
necessarily slow down development but are intellectual curiosities. In particular, just
why are generics limited to types and methods?
14 More mathematically minded readers might want to consider what a Complex<Complex<double>> would
   mean. You're on your own there, I'm afraid.
3.6.3
Lack of generic properties, indexers, and other member types
We’ve seen generic types (classes, structs, delegates, and interfaces) and we’ve seen
generic methods. There are plenty of other members that could be parameterized.
However, there are no generic properties, indexers, operators, constructors, finalizers, or events. First let’s be clear about what we mean here: clearly an indexer can have
a return type that is a type parameter (List<T> is an obvious example). What we can't have is
an indexer or property (or any of the other members in that list) with extra type
parameters. Leaving the possible syntax of declaration aside for the minute, let’s look
at how these members might have to be called:
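SomeClass<string> instance = new SomeClass<string><Guid>("x");
int x = instance.SomeProperty<int>;
byte y = instance.SomeIndexer<byte>["key"];
instance.Click<byte> += ByteHandler;
instance = instance +<int> instance;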
I hope you’ll agree that all of those look somewhat silly. Finalizers can’t even be called
explicitly from C# code, which is why there isn’t a line for them. The fact that we can’t
do any of these isn’t going to cause significant problems anywhere, as far as I can
see—it’s just worth being aware of it as an academic limitation.
The one exception to this is possibly the constructor. However, a static generic
method in the class is a good workaround for this, and the syntax with two lists of type
arguments is horrific.
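Here's a small sketch of that workaround. Catalog<T> and CreateFrom are hypothetical names made up for the example. The static method introduces the extra type parameter (TSource) that a generic constructor would have needed, and the constructor itself stays private and nongeneric.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration only.
public class Catalog<T>
{
    private readonly List<T> items = new List<T>();

    // Constructors can't declare type parameters of their own.
    private Catalog() { }

    public int Count { get { return items.Count; } }

    public T this[int index] { get { return items[index]; } }

    // Stands in for a constructor with an extra type parameter TSource.
    public static Catalog<T> CreateFrom<TSource>(
        IEnumerable<TSource> source, Converter<TSource, T> converter)
    {
        Catalog<T> result = new Catalog<T>();
        foreach (TSource item in source)
        {
            result.items.Add(converter(item));
        }
        return result;
    }
}

public static class CatalogDemo
{
    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };
        Catalog<string> catalog = Catalog<string>.CreateFrom<int>(
            numbers, delegate(int value) { return value.ToString(); });

        Console.WriteLine(catalog.Count);
        Console.WriteLine(catalog[0]);
    }
}
```

The call still involves two lists of type arguments, but they sit in sensible places: one on the type and one on the method.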
These are by no means the only limitations of C# generics, but I believe they’re the
ones that you’re most likely to run up against, either in your daily work, in community
conversations, or when idly considering the feature as a whole. In our next two sections we’ll see how some aspects of these aren’t issues in the two languages whose features are most commonly compared with C#’s generics: C++ (with templates) and Java
(with generics as of Java 5). We’ll tackle C++ first.
3.6.4
Comparison with C++ templates
C++ templates are a bit like macros taken to an extreme level. They’re incredibly powerful, but have costs associated with them both in terms of code bloat and ease of
understanding.
When a template is used in C++, the code is compiled for that particular set of template arguments, as if the template arguments were in the source code. This means that
there’s not as much need for constraints, as the compiler will check whether you’re
allowed to do everything you want to with the type anyway while it’s compiling the code
for this particular set of template arguments. The C++ standards committee has recognized that constraints are still useful, though, and they will be present in C++0x (the
next version of C++) under the name of concepts.
The C++ compiler is smart enough to compile the code only once for any given set
of template arguments, but it isn’t able to share code in the way that the CLR does with
reference types. That lack of sharing does have its benefits, though: it allows type-specific optimizations, such as inlining method calls for some type parameters but not
others, from the same template. It also means that overload resolution can be performed separately for each set of type parameters, rather than just once based solely
on the limited knowledge the C# compiler has due to any constraints present.
Don’t forget that with “normal” C++ there’s only one compilation involved, rather
than the “compile to IL” then “JIT compile to native code” model of .NET. A program
using a standard template in ten different ways will include the code ten times in a C++
program. A similar program in C# using a generic type from the framework in ten different ways won’t include the code for the generic type at all—it will refer to it, and the
JIT will compile as many different versions as required (as described in section 3.4.2) at
execution time.
One significant feature that C++ templates have over C# generics is that the template
arguments don’t have to be type names. Variable names, function names, and constant
expressions can be used as well. A common example of this is a buffer type that has the
size of the buffer as one of the template arguments, so a buffer<int,20> will always
be a buffer of 20 integers, and a buffer<double,35> will always be a buffer of 35 doubles.
This ability is crucial to template metaprogramming,15 an advanced C++ technique the
very idea of which scares me, but that can be very powerful in the hands of experts.
C++ templates are more flexible in other ways, too. They don’t suffer from the
problem described in 3.6.2, and there are a few other restrictions that don’t exist in
C++: you can derive a class from one of its type parameters, and you can specialize a
template for a particular set of type arguments. The latter ability allows the template
author to write general code to be used when there’s no more knowledge available
but specific (often highly optimized) code for particular types.
The same variance issues of .NET generics exist in C++ templates as well—an
example given by Bjarne Stroustrup16 is that there are no implicit conversions
between Vector<shape*> and Vector<circle*>, because
it might allow you to put a square peg in a round hole.
For further details of C++ templates, I recommend Stroustrup’s The C++
Programming Language (Addison-Wesley, 1991). It’s not always the easiest book to
follow, but the templates chapter is fairly clear (once you get your mind around C++
terminology and syntax). For more comparisons with .NET generics, look at the blog
post by the Visual C++ team on this topic:
http://blogs.msdn.com/branbray/archive/2003/11/19/51023.aspx.
The other obvious language to compare with C# in terms of generics is Java, which
introduced the feature into the mainstream language for the 1.5 release,17 several
years after other projects had compilers for their Java-like languages.
15 http://en.wikipedia.org/wiki/Template_metaprogramming
16 The inventor of C++.
17 Or 5.0, depending on which numbering system you use. Don't get me started.
3.6.5
Comparison with Java generics
Where C++ includes more of the template in the generated code than C# does, Java
includes less. In fact, the Java runtime doesn’t know about generics at all. The Java
bytecode (roughly equivalent terminology to IL) for a generic type includes some
extra metadata to say that it’s generic, but after compilation the calling code doesn’t
have much to indicate that generics were involved at all—and certainly an instance of
a generic type only knows about the nongeneric side of itself. For example, an
instance of HashSet<T> doesn't know whether it was created as a HashSet<String> or
a HashSet<Object>.