Bang (!) methods and “danger”
Dangerous can mean whatever the person writing the method wants it to mean. In
the case of the built-in classes, it usually means this method, unlike its non-bang equivalent,
permanently modifies its receiver. It doesn’t always, though: exit! is a dangerous alternative to exit, in the sense that it doesn’t run any finalizers on the way out of the program. The danger in sub! (a method that substitutes a replacement string for a
matched pattern in a string) is partly that it changes its receiver and partly that it
returns nil if no change has taken place—unlike sub, which always returns a copy of
the original string with the replacement (or no replacement) made.
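The difference in return values is easy to miss, so here is a minimal sketch of it. The variable names are made up for illustration, and .dup is used only to guarantee a string that may be modified.

```ruby
text = "hello world".dup               # .dup gives us a string we are free to change

copy      = text.sub(/xyz/, "abc")     # no match: sub still returns a new string
no_change = text.sub!(/xyz/, "abc")    # no match: sub! returns nil
changed   = text.sub!(/world/, "Ruby") # match: sub! modifies text and returns it
```

One practical consequence: chaining another call directly onto sub! can fail with a NoMethodError when nothing matched, because the chained method is sent to nil.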
If “danger” is too melodramatic for you, you can think of the ! in method names as
a kind of “Heads up!” And, with very few, very specialized exceptions, every bang
method should occur in a pair with a non-bang equivalent. We’ll return to questions
of best method-naming practice after we’ve looked at some bang methods in action.
7.3.1 Destructive (receiver-changing) effects as danger
No doubt most of the bang methods you’ll come across in the core Ruby language
have the bang on them because they’re destructive: they change the object on which
they’re called. Calling upcase on a string gives you a new string consisting of the original string in uppercase; but upcase! turns the original string into its own uppercase
equivalent, in place:
>> str = "Hello"
=> "Hello"
>> str.upcase
=> "HELLO"
>> str             # B
=> "Hello"
>> str.upcase!
=> "HELLO"
>> str             # C
=> "HELLO"
Examining the original string after converting it to uppercase shows that the uppercase version was a copy; the original string is unchanged B. But the bang operation
has changed the content of str itself C.
Ruby’s core classes are full of destructive (receiver-changing) bang methods paired
with their non-destructive counterparts: sort/sort! for arrays, strip/strip! (strip
leading and trailing whitespace) for strings, reverse/reverse! for strings and arrays,
and many more. In each case, if you call the non-bang version of the method on the
object, you get a new object. If you call the bang version, you operate in-place on the
same object to which you sent the message.
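As a small sketch of two of the pairs just mentioned (the variable names are illustrative only):

```ruby
numbers = [3, 1, 2]
sorted  = numbers.sort          # a new array; numbers is untouched
numbers_before = numbers.dup    # snapshot taken before the bang call
numbers.sort!                   # the same array, reordered in place

padded  = "  ruby  ".dup        # .dup guarantees a modifiable string
trimmed = padded.strip          # a new string without the whitespace
padded.strip!                   # padded itself loses its whitespace
```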
In the rest of the book, you’ll see mention made several times of methods that
have bang equivalents. Unless otherwise specified, that means the bang version of the
method replaces the original content of the object with the results of the method call.
Again, no rule says that this is the case, but it’s a common scenario.
You should always be aware of whether the method you’re calling changes its
receiver. Neither option is always right or wrong; which is best depends on what you’re
doing. One consideration, weighing in on the side of modifying objects instead of creating new ones, is efficiency: creating new objects (like a second string that’s identical
to the first except for one letter) is expensive, in terms of memory and processing.
This doesn’t matter much if you’re dealing with a small number of objects. But when
you get into, say, handling data from large files and using loops and iterators to do so,
creating new objects can be a drain on resources.
On the other hand, you need to be cautious about modifying objects in place,
because other parts of the program may depend on those objects not to change. For
example, let’s say you have a database of names. You read the names out of the database into an array. At some point, you need to process the names for printed output—
all in capital letters. You may do something like this:
names.each do |name|
  capped = name.upcase
  # ...code that does something with capped...
end
In this example, capped is a new object: an uppercase duplicate of name. When you go
through the same array later, in a situation where you do not want the names in uppercase, such as saving them back to the database, the names will be the way they were
originally.
By creating a new string (capped) to represent the uppercase version of each
name, you avoid the side effect of changing the names permanently. The operation
you perform on the names achieves its goals without changing the basic state of the
data. Sometimes you’ll want to change an object permanently, and sometimes you’ll
want not to; there’s nothing wrong with that, as long as you know which you’re doing
and why.
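To make the side effect concrete, here is a sketch that runs both approaches over the same array. The sample names are invented, and the database is left out entirely.

```ruby
names = ["David", "Yukihiro"].map(&:dup)    # stand-in for names read from a database

shouted = names.map { |name| name.upcase }  # new strings; names is untouched
names_after_safe_pass = names.map(&:dup)    # snapshot: still the original spelling

names.each { |name| name.upcase! }          # every element is changed in place
```

After the last line, anything else holding a reference to those strings sees the uppercase versions too.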
Furthermore, don’t assume a direct correlation between bang methods and
destructive methods. They often coincide, but they’re not the same thing.
7.3.2 Destructiveness and “danger” vary independently
What follows here is some commentary on conventions and best practices. Ruby
doesn’t care; Ruby is happy to execute methods whose names end in ! whether
they’re dangerous, safe, paired with a non-bang method, not paired—whatever. The
value of the ! notation as a token of communication between a method author and a
user of that method resides entirely in conventions. It’s worth gaining a solid understanding of those conventions, and why they make sense.
The best advice on when to use bang-terminated method names is…
DON’T USE ! EXCEPT IN M/M! METHOD PAIRS
The ! notation for a method name should only be used when there is a method of the same
name without the !, when the relation between those two methods is that they both do
substantially the same thing, and when the bang version also has side effects, a different return value, or some other behavior that diverges from its non-bang counterpart.
Don’t use the ! just because you think your method is dangerous in some vague,
abstract way. All methods do something; that in itself isn’t dangerous. The ! is a warning
that there may be more going on than the name suggests—and that, in turn, makes
sense only if the name is in use for a method that doesn’t have the dangerous behavior.
Don’t call a method save! just because it writes text from an object to a file. Call
that method save; and then, if you have another method that writes text from an
object to a file but (say) doesn’t back up the original file (assuming that save does so),
go ahead and call that one save!.
If you find yourself writing one method to write to the file, and you put a ! at the
end because you’re worried the method is too powerful or too unsafe, you should
reconsider your method naming. Any experienced programmer who sees a save!
method documented is going to want to know how it differs from save. The exclamation point doesn’t mean anything in isolation; it only makes sense at the end of one of
a pair of otherwise identical method names.
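The save/save! pairing might look something like the following sketch. The Note class, its attributes, and the .bak naming are all assumptions made up for illustration; they don't come from any library.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical class, for illustration only.
class Note
  attr_reader :path
  attr_accessor :text

  def initialize(path, text)
    @path = path
    @text = text
  end

  # The plain member of the pair: back up any existing file, then write.
  def save
    FileUtils.cp(path, "#{path}.bak") if File.exist?(path)
    File.write(path, text)
  end

  # The "dangerous" member: same job, but no backup is made.
  def save!
    File.write(path, text)
  end
end

dir  = Dir.mktmpdir
note = Note.new(File.join(dir, "note.txt"), "first draft")
note.save!                                   # writes the file, no backup
note.text = "second draft"
note.save                                    # copies the old contents to note.txt.bak first
backup_text  = File.read("#{note.path}.bak")
current_text = File.read(note.path)
FileUtils.remove_entry(dir)                  # clean up the temporary directory
```

The bang earns its place here only because save exists and behaves more cautiously.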
DON’T EQUATE ! NOTATION WITH DESTRUCTIVE BEHAVIOR, OR VICE VERSA
Because danger in the bang sense usually means destructive, it’s not
uncommon to hear people assert that the ! means destructive. (In some programming languages, that’s the case.) From there, it’s not much of a leap to start wondering why some destructive methods don’t end with !.
This line of thinking is problematic from the start. The bang doesn’t mean
destructive; it means dangerous, possibly unexpected behavior. If you have a
method called upcase and you want to write a destructive version of it, you’re free to
call it destructive_upcase; no rule says you have to add a ! to the original name. It’s
just a convention, but it’s an expressive one.
Moreover, destructive methods do not always end with !, nor would that make
sense. Many non-bang methods have names that lead you to expect the receiver to
change. These methods have no non-destructive counterparts. (What would it mean
to have a non-destructive version of String#clear, which removes all characters from
a string and leaves it equal to ""? If you’re not changing the string in place, why
wouldn’t you just write "" in the first place?) If a method name without a bang already
suggests in-place modification or any other kind of “dangerous” behavior, then it’s not
a dangerous method.
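A few core methods of this kind, sketched with throwaway variables:

```ruby
str = "hello".dup      # .dup guarantees a modifiable string
str << " there"        # String#<< appends in place; no bang in sight
appended = str.dup     # snapshot of the appended value
str.clear              # String#clear empties the very same object

list = [1, 2]
list.push(3)           # Array#push changes its receiver...
last = list.pop        # ...and so does Array#pop
```

None of these has a non-destructive twin, so none of them needs a ! to set it apart.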
You’ll almost certainly find that the conventional usage of the ! notation is the
most elegant and logical usage. It’s best not to slap bangs on names unless you’re playing along with those conventions.
Leaving danger behind us, we’ll look next at the facilities Ruby provides for converting one object to another.
7.4 Built-in and custom to_* (conversion) methods
Ruby offers a number of built-in methods whose names consist of to_ plus an indicator of a class to which the method converts an object: to_s (to string), to_sym (to symbol), to_a (to array), to_i (to integer), and to_f (to float). Not all objects respond to
all of these methods. But many objects respond to a lot of them, and the principle is
consistent enough to warrant looking at them collectively.
7.4.1 String conversion: to_s
The most commonly used to_ method is probably to_s. Every Ruby object responds
to to_s; every Ruby object has a way of displaying itself as a string. What to_s does, as
the following irb excerpts show, ranges from nothing other than return its own
receiver, when the object is already a string:
>> "I am already a string!".to_s
=> "I am already a string!"
to returning a string containing a code-like representation of an object:
>> ["one", "two", "three", 4, 5, 6].to_s
=> "[\"one\", \"two\", \"three\", 4, 5, 6]"
(where the backslash-escaped quotation marks mean there’s a literal quotation mark
inside the string), to returning an informative, if cryptic, descriptive string about an
object:
>> Object.new.to_s
=> "#<Object:0x...>"
The salient point about to_s is that it’s used by certain methods and in certain syntactic contexts to provide a canonical string representation of an object. The puts
method, for example, calls to_s on its arguments. If you write your own to_s for a
class or override it on an object, your to_s will surface when you give your object to
puts. You can see this clearly, if a bit nonsensically, using a generic object:
>> obj = Object.new      # B
=> #<Object:0x...>
>> puts obj              # C
#<Object:0x...>
=> nil
>> def obj.to_s          # D
>>   "I'm an object!"
>> end
=> nil
>> puts obj              # E
I'm an object!
=> nil
The object’s default string representation is the usual class and memory location
screen dump B. When you call puts on the object C, that’s what you see. But if you
define a custom to_s method on the object D, subsequent calls to puts reflect the
new definition E.
You also get the output of to_s when you use an object in string interpolation:
>> "My object says: #{obj}"
=> "My object says: I'm an object!"
Don’t forget, too, that you can call to_s explicitly. You don’t have to wait for Ruby to
go looking for it. But a large percentage of calls to to_s are automatic, behind-the-scenes calls on behalf of puts or the interpolation mechanism.
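The same thing works when to_s is defined in a class rather than on a single object. The Temperature class here is a made-up example:

```ruby
# Hypothetical class, for illustration only.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} degrees"
  end
end

reading  = Temperature.new(21)
implicit = "Current reading: #{reading}"   # interpolation calls to_s for you
explicit = reading.to_s                    # calling it yourself works too
puts reading                               # puts also goes through to_s
```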
NOTE
When it comes to generating string representations of their instances,
some built-in classes do things a little differently from the defaults. For
example, if you call puts on an array, you get a cyclical representation
based on calling to_s on each of the elements in the array and outputting one per line. That is a special behavior; it doesn’t correspond to
what you get when you call to_s on an array, namely a string representation of the array in square brackets, as an array literal.
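You can compare the two behaviors side by side. This sketch captures the output of puts in a StringIO (an in-memory stream from the standard library) so that it can be looked at as a string; the sample array is arbitrary.

```ruby
require "stringio"

array = [1, [2, 3], "four"]

out = StringIO.new
out.puts(array)          # same element-by-element behavior as a bare puts
printed = out.string     # one element per line, nested arrays included

as_string = array.to_s   # a single bracketed string
```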
While we’re looking at string representations of objects, let’s examine a few related
methods. We’re drifting a bit from the to_* category, perhaps; but these are all methods that generate strings from objects, and a consideration of them is therefore timely.
Like puts, the built-in inspect method piggybacks on to_s. Out of the box, for a
generic object whose to_s and inspect haven’t been overridden, these three expressions print the same thing:
puts object
puts object.to_s
puts object.inspect
So why have inspect?
BORN TO BE OVERRIDDEN: inspect
The reason for having inspect as well as to_s is so that you can override it, independently of whether you override to_s. You don’t always need to do so, of course. But
you may want an object to have one string representation for interpolation in strings
and other public-facing contexts, and another, perhaps more introspective, representation for programmer inspection. Or, as in the case of the built-in Regexp (regular
expression) class in Ruby, you may want two string representations that convey different information without one necessarily being more public-friendly than the other.
Here’s how regular expressions display themselves:
>> re = /\(\d{3}\) \d{3}-\d{4}/     # B
=> /\(\d{3}\) \d{3}-\d{4}/          # C
>> puts re                          # D
(?-mix:\(\d{3}\) \d{3}-\d{4})
=> nil
>> puts re.inspect                  # E
/\(\d{3}\) \d{3}-\d{4}/
=> nil
The regular expression re matches a typical United States phone number of the form
(nnn) nnn-nnnn (where n is a digit) B. To start with, irb prints out the value of the
assignment, which as always is its right-hand side—so already we’re seeing irb’s version
of a string representation of a regular expression C. But look at what we get from puts
re: a different-looking string, including some metadata about the object D. (-mix
means we haven’t used any of the m, i, or x modifiers. Don’t worry for now about what
that means.)
Asking explicitly for a printout of re.inspect E takes us back to what we got
from irb in the first place, because irb uses inspect to generate string representations of objects.
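Here is a sketch of a class that overrides the two methods independently. The Account class and the format of its inspect string are assumptions chosen for illustration.

```ruby
# Hypothetical class, for illustration only.
class Account
  def initialize(owner, balance)
    @owner = owner
    @balance = balance
  end

  # Public-facing representation
  def to_s
    @owner
  end

  # Programmer-facing representation
  def inspect
    "#<Account owner=#{@owner.inspect} balance=#{@balance}>"
  end
end

account  = Account.new("David", 100)
friendly = "Welcome back, #{account}"   # interpolation uses to_s
detailed = account.inspect              # the string irb would display
```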
The p method provides another variation on the string-conversion and -representation theme.
WRAPPING inspect IN p
The following two statements are almost equivalent:
puts obj.inspect
p obj
You can see the difference clearly if you look at the return values in irb:
>> array = [1,2,3,4]
=> [1, 2, 3, 4]
>> puts array.inspect
[1, 2, 3, 4]
=> nil
>> p array
[1, 2, 3, 4]
=> [1, 2, 3, 4]
Both commands cause the inspect method to be called on array and its result to be
printed. The call to puts returns nil (the nil we usually ignore when irb reports it),
but p returns array.
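Because p hands back its argument, you can drop it into the middle of an expression without changing the result. A small sketch (the numbers are arbitrary):

```ruby
values = [1, 2, 3, 4]

# p prints the doubled array, then passes that same array along to sum
total = p(values.map { |n| n * 2 }).sum

# puts, by contrast, returns nil, so nothing useful can be chained onto it
puts_result = puts(values.inspect)
```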
Another, less frequently used method generates and displays a string representation of an object: display.
USING display
You won’t see display much. It occurs only once, at last count, in all the Ruby program files in the entire standard library. (inspect occurs 160 times.) It’s a specialized
output method.
display takes an argument: a writable output stream, in the form of a Ruby I/O
object. By default, it uses STDOUT, the standard output stream:
>> "Hello".display
Hello=> nil
Note that display, unlike puts but like print, doesn’t automatically insert a newline
character. That’s why => nil is run together on one line with the output.
You can redirect the output of display by providing, for example, an open filehandle as an argument:
>> fh = File.open("/tmp/display.out", "w")
=> #<File:/tmp/display.out>
>> "Hello".display(fh)                       # B
=> nil
>> fh.close
=> nil
>> puts(File.read("/tmp/display.out"))       # C
Hello
The string “Hello” is “displayed” directly to the file B, as we confirm by reading the
contents of the file in and printing them out C.
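Any writable stream will do, not just a file. As a sketch, a StringIO from the standard library can stand in for the filehandle, which avoids touching the filesystem:

```ruby
require "stringio"

buffer = StringIO.new        # in-memory stand-in for an open file
"Hello".display(buffer)      # written with no trailing newline
42.display(buffer)           # non-strings are written in their string form
written = buffer.string
```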
Let’s leave string territory at this point and look at how conversion techniques play
out in the case of the Array class.
7.4.2 Array conversion with to_a and the * operator
The to_a (to array) method, if defined, provides an array-like representation of
objects. One of to_a’s most striking features is that it automatically ties in with the *
operator. The * operator (pronounced “star,” “unarray,” or, among the whimsically
inclined, “splat”) does a kind of unwrapping of its operand into its components, those
components being the elements of its array representation.
You’ve already seen the star operator used in method parameter lists, where it
denotes a parameter that sponges up the optional arguments into an array. In the
more general case, the star turns any array, or any object that responds to to_a, into
the equivalent of a bare list.
The term bare list means several identifiers or literal objects separated by commas.
Bare lists are valid syntax only in certain contexts. For example, you can put a bare list
inside the literal array constructor brackets:
[1,2,3,4,5]
It’s a subtle distinction, but the notation lying between the brackets isn’t, itself, an
array; it’s a list, and the array is constructed from the list, thanks to the brackets.
The star has a kind of bracket-removing or un-arraying effect. What starts as an
array becomes a list. You can see this if you construct an array from a starred array:
>> array = [1,2,3,4,5]
=> [1, 2, 3, 4, 5]
>> [*array]
=> [1, 2, 3, 4, 5]
The array in array has been demoted, so to speak, from an array to a bare list, courtesy of the star. Compare this with what happens if you don’t use the star:
>> [array]
=> [[1, 2, 3, 4, 5]]
Here, the list from which the new array gets constructed contains one item: the object
array. That object has not been mined for its inner elements, as it was in the example
with the star.
One implication is that you can use the star in front of a method argument to turn
it from an array into a list. You do this in cases where you have objects in an array that
you need to send to a method that’s expecting a broken-out list of arguments:
def combine_names(first_name, last_name)
  first_name + " " + last_name
end
names = ["David", "Black"]
puts combine_names(*names)    # Output: David Black
If you don’t use the un-arraying star, you’ll send just one argument—an array—to the
method, and the method won’t be happy.
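The star works the same way on an object that merely responds to to_a. In this sketch, the Point class and the distance_from_origin method are invented for illustration:

```ruby
# Hypothetical class, for illustration only.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  def to_a
    [@x, @y]
  end
end

def distance_from_origin(x, y)
  Math.sqrt(x**2 + y**2)
end

point    = Point.new(3, 4)
coords   = [*point]                       # the star calls to_a, then unwraps the result
distance = distance_from_origin(*point)   # two arguments arrive, not one Point
```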
Let’s turn to numbers.
7.4.3 Numerical conversion with to_i and to_f
Unlike some programming languages, such as Perl, Ruby doesn’t automatically convert from strings to integers. You can’t do this
x = "We're number "
y = 1
puts x + y      # TypeError: can't convert Fixnum into String
because Ruby doesn’t know how to add a string and an integer together. Similarly,
you’ll get a surprise if you do this:
print "Enter a number: "
n = gets.chomp
puts n * 100
You’ll see the string version of the number printed out 100 times. (This result also tells
you that Ruby lets you multiply a string—but it’s always treated as a string, even if it
consists of digits.) If you want the number, you have to turn it into a number explicitly:
n = gets.to_i
As you’ll see if you experiment with converting strings to integers (which you can do
easily in irb with expressions like "hello".to_i), the to_i conversion value of strings
that have no reasonable integer equivalent (including “hello”) is always 0. If your
string starts with digits but isn’t made up entirely of digits (“123hello”), the non-digit
parts are ignored and the conversion is performed only on the leading digits.
The to_f (to float) conversion gives you, predictably, a floating-point equivalent of any integer. The rules pertaining to non-conforming characters are similar
to those governing string-to-integer conversions: "hello".to_f is 0.0, whereas
"1.23hello".to_f is 1.23. If you call to_f on a float, you get the same float back.
Similarly, calling to_i on an integer returns that integer.
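These rules are quick to check for yourself. A sketch, with variable names chosen only for readability:

```ruby
repeated = "5" * 3                  # string repetition, not arithmetic
product  = "5".to_i * 3             # convert first, then multiply

no_digits      = "hello".to_i       # nothing numeric to convert
leading_digits = "123hello".to_i    # only the leading digits count
no_float       = "hello".to_f
leading_float  = "1.23hello".to_f
```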
If the conversion rules for strings seem a little lax to you (if you don’t want strings
like “-5xyz” to succeed in converting themselves to integers or floats), you have a couple of stricter conversion techniques available to you.
STRICTER CONVERSIONS WITH Integer AND Float
Ruby provides methods called Integer and Float (and yes, they look like constants,
but they’re methods with names that coincide with those of the classes to which they
convert). These methods are similar to to_i and to_f, respectively, but a little
stricter: if you feed them anything that doesn’t conform to the conversion target type,
they raise an exception:
>> "123abc".to_i
=> 123
>> Integer("123abc")
ArgumentError: invalid value for Integer: "123abc"
>> Float("3")
=> 3.0
>> Float("-3")
=> -3.0
>> Float("-3xyz")
ArgumentError: invalid value for Float(): "-3xyz"
(Note that converting from an integer to a float is acceptable. It’s the letters that cause
the problem.)
If you want to be strict about what gets converted and what gets rejected, Integer
and Float can help you out.
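One way to put the stricter conversion to work is to rescue the exception and treat bad input as "no value." The strict_integer helper below is a hypothetical name, and returning nil is just one possible policy:

```ruby
# Hypothetical helper: returns an Integer, or nil when the text
# is not a cleanly formatted integer.
def strict_integer(text)
  Integer(text)
rescue ArgumentError
  nil
end

good    = strict_integer("123")
bad     = strict_integer("123abc")   # rejected rather than truncated
lenient = "123abc".to_i              # to_i quietly keeps the leading digits
```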
Conversion vs. typecasting
When you call methods like to_s, to_i, and to_f, the result is a new object (or the
receiver, if you’re converting it to its own class). It’s not quite the same as typecasting in C and other languages. You’re not using the object as a string or an integer;
you’re asking the object to provide a second object that corresponds to its idea of
itself (so to speak) in one of those forms.
The distinction between conversion and typecasting touches on some important
aspects of the heart of Ruby. In a sense, all objects are typecasting themselves constantly. Every time you call a method on an object, you’re asking the object to behave
as a particular type. Correspondingly, an object’s “type” is really the aggregate of
everything it can do at a particular time.
The closest Ruby gets to traditional typecasting (and it isn’t very close) is the role-playing conversion methods, described in section 7.4.4.
Getting back to the to_* family of converters: in addition to the straightforward
object-conversion methods, Ruby gives you a couple of to_* methods that have a little
extra intelligence about what their value is expected to do.
7.4.4 The role-playing to_* methods
It’s somewhat against the grain in Ruby programming to worry much about what class
an object belongs to. All that matters is what the object can do—what methods it can
execute.
But in a few cases involving the core classes, strict attention is paid to the class of
objects. Don’t think of this as a blueprint for “the Ruby way” of thinking about objects.
It’s more like an expedient that bootstraps you into the world of the core objects in
such a way that once you get going, you can devote less thought to your objects’ class
memberships.
STRING ROLE-PLAYING WITH to_str
If you want to print an object, you can define a to_s method for it or use whatever
to_s behavior it’s been endowed with by its class. But what if you need an object to be
a string?
The answer is that you define a to_str method for the object. An object’s to_str
representation enters the picture when you call a core method that requires that its
argument be a string.
The classic example is string addition. Ruby lets you add two strings together, producing a third string:
>> "Hello " + "there."
=> "Hello there."
If you try to add a non-string to a string, you get an error:
>> "Hello " + 10
TypeError: can't convert Fixnum into String
This is where to_str comes in. If an object responds to to_str, its to_str representation will be used when the object is used as the argument to String#+.
Here’s an example involving a simple Person class. The to_str method is a wrapper around the name method:
class Person
  attr_accessor :name
  def to_str
    name
  end
end
If you create a Person object and add it to a string, to_str kicks in with the name
string:
david = Person.new
david.name = "David"
puts "david is named " + david + "."    # Output: david is named David.
The to_str conversion is also used by the << (append to string) method. And arrays,
like strings, have a role-playing conversion method.
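The append case can be sketched as follows. The Person class is restated from the example above so that the snippet runs on its own; the greeting variable is made up.

```ruby
# Restates the Person class from the text so this runs standalone.
class Person
  attr_accessor :name

  def to_str
    name
  end
end

david = Person.new
david.name = "David"

greeting = "Hello, ".dup   # .dup guarantees a modifiable string
greeting << david          # String#<< asks david for its to_str value
```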
ARRAY ROLE-PLAYING WITH to_ary
Objects can masquerade as arrays if they have a to_ary method. If such a method is
present, it’s called on the object in cases where an array, and only an array, will do—
for example, in an array-concatenation operation.
Here’s another Person implementation, where the array role is played by an array
containing three person attributes:
class Person
  attr_accessor :name, :age, :email
  def to_ary
    [name, age, email]
  end
end
Concatenating a Person object to an array has the effect of adding the name, age, and
email values to the target array:
david = Person.new
david.name = "David"
david.age = 49
david.email = "david@wherever"
array = []
array.concat(david)
p array    # Output: ["David", 49, "david@wherever"]
Like to_str, to_ary provides a way for an object to function momentarily as an object
of a particular core class.
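Array concatenation with concat isn't the only place to_ary comes into play. As a sketch, array addition and multiple assignment consult it too; the Person class is restated here so the snippet is self-contained.

```ruby
# Restates the Person class from the text so this runs standalone.
class Person
  attr_accessor :name, :age, :email

  def to_ary
    [name, age, email]
  end
end

david = Person.new
david.name  = "David"
david.age   = 49
david.email = "david@wherever"

combined = ["person:"] + david   # Array#+ calls to_ary on its argument
name, age, email = david         # multiple assignment unpacks via to_ary
```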
We’ll turn now to the subject of boolean states and objects in Ruby, a topic we’ve
dipped into already but which merits closer inquiry.
7.5 Boolean states, boolean objects, and nil
Every expression in Ruby evaluates to an object, and every object has a boolean value
of either true or false. Furthermore, true and false are objects. This idea isn’t as convoluted as it sounds. If true and false weren’t objects, then a pure boolean expression like
100 > 80
would have no object to evaluate to. (And > is a method and therefore has to return an
object.)
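You can confirm both points directly: the comparison is a method call, and what it returns is an object with a class of its own. A brief sketch:

```ruby
result = 100 > 80            # the comparison evaluates to an object...
same   = 100.>(80)           # ...because > is an ordinary method call
result_class = result.class  # the class of the object true
```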
In many cases where you want to get at a truth/falsehood value, such as an if statement or a comparison between two numbers, you don’t have to manipulate these special objects directly. In such situations, you can think of truth and falsehood as states,
rather than objects.
Still, you need to be aware of the existence of the objects true and false, partly
because you may need them in your own code and partly because you may see code
like this usage example from the documentation for a popular object-relational mapping library (ActiveRecord):
create_table :authors do |t|
  t.column :name, :string, :null => false
end
You should recognize instantly that the word false represents the special object false
and isn’t a variable or method name.
We’ll look at true and false both as states and as special objects, along with the special object nil.
7.5.1 True and false as states
Every expression in Ruby is either true or false, in a logical or boolean sense. The best
way to get a handle on this is to think in terms of conditional statements. For every
expression e in Ruby, you can do this:
if e
and Ruby can make sense of it.
For lots of expressions, a conditional test is a stretch; but it can be instructive to try
it on a variety of expressions, as listing 7.1 shows.
Listing 7.1 Testing the boolean value of expressions using if constructs
if (class MyClass; end)                      # B
  puts "Empty class definition is true!"