2 Modifying Ruby’s core classes and modules


It may be tempting to do something like this, in order to avoid the error:

class Regexp
  alias __old_match__ match    # B
  def match(string)
    __old_match__(string) || []
  end
end

This code first sets up an alias for match, courtesy of the alias keyword B. Then, the

code redefines match. The new match hooks into the original version of match

(through the alias) and then returns either the result of calling the original version or

(if that call returns nil) an empty array.

NOTE

An alias is a synonym for a method name. Calling a method by an alias doesn’t involve any change of behavior or any alteration of the method-lookup process. The choice of alias name in the previous example is based on a fairly conventional formula: the addition of the word old plus the leading and trailing underscores. (A case could be made that the formula is too conventional and that you should create names that are less likely to be chosen by other overriders who also know the convention!)

You can now do this:

/abc/.match("X")[1]

Even though the match fails, the program won’t blow up, because the failed match

now returns an empty array rather than nil. The worst you can do with the new match

is try to index an empty array, which is legal. (The result of the index operation will be

nil, but at least you’re not trying to index nil.)

The problem is that the person using your code may depend on the match operation to return nil on failure:

if regexp.match(string)
  # do something
else
  # do something else
end

Because an array (even an empty one) is true, whereas nil is false, returning an array for a failed match operation means that the true/false test (as embodied in an if/else statement) always succeeds, and the else branch never runs.
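A minimal sketch of the truthiness problem. It simulates the overridden match with `|| []` rather than patching Regexp, and branch_taken is a hypothetical helper introduced only to make the if/else outcome visible:

```ruby
# branch_taken is a hypothetical helper, used only to expose
# which branch of the if/else runs.
def branch_taken(match_result)
  if match_result
    :matched
  else
    :no_match
  end
end

original_result = /abc/.match("X")        # the built-in match returns nil here
patched_result  = original_result || []   # what the overridden match would return

p branch_taken(original_result)   # :no_match
p branch_taken(patched_result)    # :matched, even though nothing matched
```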

Maybe changing Regexp#match so as not to return nil on failure is something your

instincts would tell you not to do anyway. And no one advocates doing it; it’s more that

some new Ruby users don’t connect the dots and therefore don’t see that changing a

core method in one place changes it everywhere.

Another common example, and one that’s a little more subtle (both as to what it

does and as to why it’s not a good idea), involves the String#gsub! method.

Licensed to sam kaplan


CHAPTER 13

Object individuation

THE RETURN VALUE OF STRING#GSUB! AND WHY IT SHOULD STAY THAT WAY

As you’ll recall, String#gsub! does a global replace operation on its receiver, saving

the changes in the original object:

>> string = "Hello there!"
=> "Hello there!"
>> string.gsub!(/e/, "E")
=> "HEllo thErE!"                # B
>> string
=> "HEllo thErE!"                # C

As you can see, the return value of the call to gsub! is the string object with the

changes made B. (And examining the object again via the variable string confirms

that the changes are indeed permanent C.)

Interestingly, though, something different happens when the gsub! operation

doesn’t result in any changes to the string:

>> string = "Hello there!"
=> "Hello there!"
>> string.gsub!(/zzz/, "xxx")
=> nil
>> string
=> "Hello there!"

There’s no match on /zzz/, so the string isn’t changed—and the return value of the

call to gsub! is nil.

Like the nil return from a match operation, the nil return from gsub! has the

potential to make things blow up when you’d rather they didn’t. Specifically, it means

you can’t use gsub! reliably in a chain of methods:

>> string = "Hello there!"
=> "Hello there!"
>> string.gsub!(/e/, "E").reverse!         # B
=> "!ErEht ollEH"                          # C
>> string = "Hello there!"
=> "Hello there!"
>> string.gsub!(/zzz/, "xxx").reverse!
NoMethodError: undefined method `reverse!' for nil:NilClass    # D

This example does something similar (but not quite the same) twice. The first time

through, the chained calls to gsub! and reverse! B return the newly gsub!’d and

reversed string C. But the second time, the chain of calls results in a fatal error D:

the gsub! call didn’t change the string, so it returned nil—which means we called

reverse! on nil rather than on a string.
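For comparison, a small sketch (variable names are illustrative) of the same chain built on the non-bang gsub. It never hits nil, because gsub returns a string whether or not anything was replaced:

```ruby
string = "Hello there!"

# gsub returns a new string even when the pattern doesn't match,
# so the chained call to reverse always has a string to work on.
changed   = string.gsub(/e/, "E").reverse
unchanged = string.gsub(/zzz/, "xxx").reverse

p changed     # "!ErEht ollEH"
p unchanged   # "!ereht olleH"
p string      # "Hello there!" (the original is untouched)
```

The trade-off is that each step allocates a new string instead of modifying the receiver in place.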

One possible way of handling the inconvenience of having to work around the nil

return from gsub! is to take the view that it’s not usually appropriate to chain method

calls together too much anyway. And you can always avoid chain-related problems if

you don’t chain:


The tap method

The tap method (callable on any object) performs the somewhat odd but potentially

useful task of executing a code block, yielding the receiver to the block, and returning

the receiver. It’s easier to show this than to describe it:

>> "Hello".tap {|string| puts string.upcase }.reverse

HELLO

=> "olleH"

Called on the receiver “Hello”, the tap method yields that string back to its code

block, as confirmed by the printing out of the uppercased version of the string. Then,

tap returns the entire string—so the reverse operation is performed on the string. If

you call gsub! on a string inside a tap block, it doesn’t matter whether it returns nil,

because tap returns the string. Be careful, though. Using tap to circumvent the nil

return of gsub! (or of other similarly behaving bang methods) can introduce complexities of its own, especially if you do multiple chaining where some methods perform

in-place operations and others return object copies.
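A short sketch of the tap workaround described in this sidebar, assuming a Ruby with Object#tap (1.9 or later). The call to dup is there only so the example also works where string literals are frozen:

```ruby
string = "Hello there!".dup

# gsub! returns nil here because nothing matches, but tap ignores the
# block's value and hands the receiver on to reverse!.
result = string.tap { |s| s.gsub!(/zzz/, "xxx") }.reverse!

p result   # "!ereht olleH"
p string   # "!ereht olleH" (reverse! changed the original in place)
```

Note that result and string are the same object here. Swapping reverse! for reverse would leave string unchanged, which is the kind of in-place versus copy mix-up the sidebar warns about.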

>> string = "Hello there!"
=> "Hello there!"
>> string.gsub!(/zzz/, "xxx")
=> nil
>> string.reverse!
=> "!ereht olleH"

Still, a number of Ruby users have been bitten by the nil return value, either because they expected gsub! to behave like gsub (the non-bang version, which always returns a string, whether there’s been a change or not) or because they didn’t anticipate a case where the string wouldn’t change. So gsub! and its nil return value became a popular candidate for change.

The change can be accomplished like this:

class String
  alias __old_gsub_bang__ gsub!
  def gsub!(*args, &block)
    __old_gsub_bang__(*args, &block)
    self
  end
end

First, the original gsub! gets an alias; that will enable us to call the original version

from inside the new version. The new gsub! takes any number of arguments (the

arguments themselves don’t matter; we’ll pass them along to the old gsub!) and a

code block, which will be captured in the variable block. If no block is supplied—and

gsub! can be called with or without a block—block is nil.

Now, we call the old version of gsub!, passing it the arguments and reusing the

code block. Finally, the new gsub! does the thing it’s being written to do: it returns

self (the string), regardless of whether the call to __old_gsub_bang__ returned the

string or nil.


And now, the reasons not to do this.

Changing gsub! this way is probably less likely, as a matter of statistics, to get you in

trouble than changing Regexp#match is. Still, it’s possible that someone might write

code that depends on the documented behavior of gsub!, in particular on the returning of nil when the string doesn’t change. Here’s an example—and although it’s contrived (as most examples of this scenario are bound to be), it’s valid Ruby and

dependent on the documented behavior of gsub!:

>> states = { "NY" => "New York", "NJ" => "New Jersey",
              "ME" => "Maine" }                             # B
=> {"NY"=>"New York", "NJ"=>"New Jersey", "ME"=>"Maine"}
>> string = "Eastern states include NY, NJ, and ME."        # C
=> "Eastern states include NY, NJ, and ME."
>> if string.gsub!(/\b([A-Z]{2})\b/) { states[$1] }         # D
>>   puts "Substitution occurred"
>> else
?>   puts "String unchanged"
>> end
Substitution occurred                                       # E

We start with a hash of state abbreviations and full names B. Then comes a string that

uses state abbreviations C. The goal is to replace the abbreviations with the full

names, using a gsub! operation that captures any two consecutive uppercase letters

surrounded by word boundaries (\b) and replaces them with the value from the hash

corresponding to the two-letter substring D. Along the way, we take note of whether

any such replacements are made. If any are, gsub! returns the new version of string. If

no substitutions are made, gsub! returns nil. The result of the process is printed out

at the end E.

The damage here is relatively light, but the lesson is clear: don’t change the documented behavior of core Ruby methods. Here’s another version of the states-hash

example, using sub! rather than gsub!. In this version, failure to return nil when the

string doesn’t change triggers an infinite loop. Assuming we have the states hash and

the original version of string, we can do a one-at-a-time substitution where each substitution is reported:

>> while string.sub!(/\b([A-Z]{2})\b/) { states[$1] }
>>   puts "Replacing #{$1} with #{states[$1]}..."
>> end

Replacing NY with New York...

Replacing NJ with New Jersey...

Replacing ME with Maine...

If string.sub! always returns a non-nil value (a string), then the while condition

will never fail, and the loop will execute forever.
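The runaway loop can be observed safely with a sketch like the following. Everything here is illustrative rather than taken from the original: count_replacements is a hypothetical helper with a safety cap, and the always-return-self behavior is attached to one string object only, so String itself is left alone:

```ruby
# Hypothetical helper: counts how many times the while condition
# succeeds, giving up after `cap` iterations.
def count_replacements(string, cap = 10)
  count = 0
  while string.sub!(/a/, "b")
    count += 1
    break if count >= cap   # safety cap so the demo always terminates
  end
  count
end

normal  = "aaa".dup
patched = "aaa".dup

# Give this one object a sub! that always returns self,
# mimicking the override discussed above.
def patched.sub!(*args)
  super
  self
end

p count_replacements(normal)    # 3  (sub! eventually returns nil)
p count_replacements(patched)   # 10 (stopped only by the cap)
```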

What you should not do, then, is rewrite core methods so that they don’t do what

others expect them to do. There’s no exception to this. It’s something you should

never do, even though you can.

That leaves us with the question of how to change Ruby core functionality

safely. We’ll look at three techniques that you can consider: additive change, hook


or pass-through change, and per-object change. Only one of them is truly safe,

although all three are safe enough to use in many circumstances.

Along the way, we’ll look at custom-made examples as well as some examples from

the ActiveSupport library. ActiveSupport provides good examples of the first two

kinds of core change: additive and pass-through. We’ll start with additive.

13.2.2 Additive changes

The most common category of changes to built-in Ruby classes is the additive change:

adding a method that doesn’t exist. The benefit of additive change is that it doesn’t

clobber existing Ruby methods. The danger inherent in it is that if two programmers

write added methods with the same name, and both get included into the interpreter

during execution of a particular library or program, one of the two will clobber the

other. There’s no way to reduce that risk to zero.
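A minimal sketch of such a collision. The method names shout and whisper and the two "libraries" are hypothetical. The guard at the end is one common habit, shown as an assumption about reasonable practice rather than a complete fix:

```ruby
# "Library A" adds a method to String.
class String
  def shout
    upcase + "!"
  end
end

first_result = "hi".shout    # "HI!"

# "Library B", loaded later, happens to pick the same name.
class String
  def shout
    "#{self}!!!"
  end
end

second_result = "hi".shout   # "hi!!!" (library A's version is gone)

# A partial safeguard: add the method only if the name is still free.
class String
  unless method_defined?(:whisper)
    def whisper
      downcase + "..."
    end
  end
end

p first_result, second_result, "HEY".whisper
```

The guard prevents your code from clobbering an earlier definition, but it can't stop a later library from clobbering yours, and it silently leaves you with someone else's implementation if the name was already taken.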

Added methods often serve the purpose of providing functionality that a large

number of people want. In other words, they’re not all written for specialized use in

one program. There’s safety in numbers: if people have been discussing a given

method for years, and if a de facto implementation of the method is floating around

the Ruby world, the chances are good that if you write the method or use an existing

implementation, you won’t collide with anything that someone else may have written.

Some of the methods you’ll see traded around on mailing lists and in blog posts

are perennial favorites.

SOME OLD STANDARDS: MAP_WITH_INDEX AND SINGLETON_CLASS

In chapter 10, you learned about enumerables, enumerators, and the with_index

method. In the days before with_index allowed indexes to be part of almost any enumerable iteration, we had only each_with_index; and people often asked that there

be added to the Enumerable module a map_with_index method, which would be similar to each_with_index (it would yield one element and one integer index number

on each iteration) but would return an array representing iterative executions of the

code block, as map does.

The method was never added, and it became a common practice for people to

write their own versions of it. A typical implementation might look like this:

class Array
  def map_with_index
    mapping = []                    # B
    each_with_index do |e,i|        # C
      mapping << yield(e,i)         # D
    end
    mapping                         # E
  end
end

The method starts by creating an array in which it will accumulate the mapping of the

self-array B. Then, it iterates over the array using each_with_index C. Each time

through, it yields the current element and the current index and saves the result to

the accumulator array mapping D. Finally, it returns the mapping E.


Here’s an example of map_with_index in action:

cardinals = %w{ first second third fourth fifth }
puts [1,2,3,4,5].map_with_index {|n,i|
  "The #{cardinals[i]} number is #{n}."
}

The output is

The first number is 1.

The second number is 2.

# etc.

In Ruby 1.9 the map_with_index scenario is handled by map.with_index. But even 1.9 didn’t start out with all the old favorite add-on methods. Another commonly implemented method, and one that wasn’t in the core when 1.9 first appeared (it arrived in Ruby 1.9.2), is Object#singleton_class.
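For reference, here is how the earlier cardinals example might look with map.with_index, with no added method required. The second call, with its letters and starting offset, is an extra illustration not taken from the original:

```ruby
cardinals = %w{ first second third fourth fifth }

sentences = [1, 2, 3, 4, 5].map.with_index do |n, i|
  "The #{cardinals[i]} number is #{n}."
end

puts sentences

# with_index also accepts a starting offset:
numbered = %w{ a b c }.map.with_index(1) { |letter, i| "#{i}. #{letter}" }
p numbered   # ["1. a", "2. b", "3. c"]
```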

It’s not unusual to want to grab hold of an object’s singleton class in a variable.

Once you do so, it’s possible to manipulate it from the outside, so to speak, in ways

that go beyond what you can do by entering the class-definition context. To get an

object’s singleton class as an object, you need a way to evaluate that class at least long

enough to assign it to a variable. The technique for doing this depends on three facts

you already know.

First, it’s possible to get into a class-definition block for a singleton class:

str = "Hello"
class << str
  # We're in str's singleton class!
end

Second, the actual value of any class-definition block is the value of the last expression

evaluated inside it. Third, the value of self inside a class-definition block is the class

object itself.

Putting all this together, we can write the singleton_class method as follows.

class Object
  def singleton_class
    class << self
      self
    end
  end
end

All this method does is open the singleton class of whatever object is calling it, evaluate self, and close the definition block. Because self in a class-definition block is the

class, in this case it’s the given object’s singleton class. The result is that you can now

grab any object’s singleton class.

You’ll see this method in use later, but even now you can test it and see the effect of

having a singleton class available in a variable. Given the previous definition of

singleton_class, here’s a testbed for it:

class Person                                 # B
end

david = Person.new                           # C

def david.talk                               # D
  puts "Hi"
end

dsc = david.singleton_class                  # E

if dsc.instance_methods.include?(:talk)      # F
  puts "Yes, we have a talk method!"
end

First, we create a Person test class B as well as an instance of it C. Next, we “teach”

the object a new method: talk D. (It doesn’t matter what the method is called or

what it does; its purpose is to illustrate the workings of the singleton_class method.)

Now, we grab the singleton class of the object and store it in a variable E. Once

we’ve done this, we can, among other things, query the class as to its methods. In the

example, the class is queried as to whether it has an instance method called talk F.

The output from the program is a resounding

Yes, we have a talk method!

The singleton_class method thus lets you capture a singleton class and address it

programmatically the way you might address any other class object. It’s a handy technique, and you’ll see definitions of this method (possibly with a different name) in

many Ruby libraries and programs.
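As a sketch of what manipulating a singleton class "from the outside" can look like, the following defines a method on one string through its singleton class object. The method name shout is hypothetical, and the guard at the top is an assumption that lets the example run whether or not the Ruby in use has its own singleton_class:

```ruby
# Fallback definition for Rubies that lack a built-in singleton_class.
unless Object.method_defined?(:singleton_class)
  class Object
    def singleton_class
      class << self
        self
      end
    end
  end
end

str   = "Hello".dup
other = "Hello".dup

sc = str.singleton_class
# define_method was private in older Rubies, hence send.
sc.send(:define_method, :shout) { upcase + "!" }

p str.shout                    # "HELLO!"
p other.respond_to?(:shout)    # false
p sc.instance_methods(false)   # [:shout]
```

Only str gains the new behavior. The other string, although equal in content, has its own (untouched) singleton class.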

Another way to add functionality to existing Ruby classes and modules is with a passive hooking or pass-through technique.

13.2.3 Pass-through overrides

A pass-through method change involves overriding an existing method in such a way

that the original version of the method ends up getting called along with the new version. The new version does whatever it needs to do and then passes its arguments

along to the original version of the method. It relies on the original method to provide a return value. (As you know from the match and gsub! override examples, calling the original version of a method isn’t enough if you’re going to change the basic

interface of the method by changing its return value.)

You can use pass-through overrides for a number of purposes, including logging

and debugging:

class String
  alias __old_reverse__ reverse
  def reverse
    $stderr.puts "Reversing a string!"
    __old_reverse__
  end
end

puts "David".reverse

The output of this snippet is as follows:


Reversing a string!

divaD

The first line is printed to STDERR, and the second line is printed to STDOUT. The example depends on creating an alias for the original reverse and then calling that alias at the end of the new reverse.
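One way to check which stream each line lands on is to swap in StringIO objects temporarily. This is only a sketch: capture_streams is a hypothetical helper, and the alias name __plain_reverse__ is chosen to differ from the one above so the two snippets don't interfere if both are loaded:

```ruby
require "stringio"

class String
  alias __plain_reverse__ reverse   # hypothetical alias name
  def reverse
    $stderr.puts "Reversing a string!"
    __plain_reverse__
  end
end

# Hypothetical helper: runs the block with $stdout and $stderr
# redirected, and returns what was written to each.
def capture_streams
  old_out, old_err = $stdout, $stderr
  $stdout, $stderr = StringIO.new, StringIO.new
  yield
  [$stdout.string, $stderr.string]
ensure
  $stdout, $stderr = old_out, old_err
end

out, err = capture_streams { puts "David".reverse }

p out   # "divaD\n"
p err   # "Reversing a string!\n"
```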

Aliasing and its aliases

In addition to the alias keyword, Ruby has a method called alias_method, which is

a private instance method of Module. The upshot is that you can create an alias for

a method either like this:

class String
  alias __old_reverse__ reverse
end

or like this:

class String
  alias_method :__old_reverse__, :reverse
end

Because it’s a method and not a keyword, alias_method needs objects rather than

bare method names as its arguments. It can take symbols or strings. Note also that

the arguments to alias do not have a comma between them. Keywords get to do

things like that, but methods don’t.
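A compact sketch of the three spellings side by side, using a hypothetical Greeter class so that no core class is touched:

```ruby
class Greeter   # hypothetical class, for illustration only
  def hello
    "hello"
  end

  alias hi hello                  # keyword: bare names, no comma
  alias_method :hey, :hello       # method: symbols...
  alias_method "howdy", "hello"   # ...or strings
end

g = Greeter.new
p [g.hi, g.hey, g.howdy]   # ["hello", "hello", "hello"]
```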

Here’s another example: hooking into the Hash#[]= method so as to do something

with the key and value being added to the hash while not interfering with the basic

process of adding them to the hash:

require "yaml"                                 # B

class Hash
  alias __old_set__ []=                        # C

  def []=(key, value)
    __old_set__(key, value)                    # D
    File.open("hash_contents", "w") do |f|
      f.puts(self.to_yaml)                     # E
    end
    value
  end
end

The idea here is to write the hash out to a file in YAML format every time a key is set with []=. YAML, which stands for “YAML Ain’t Markup Language,” is a specification for a data-serialization format. In other words, the YAML standard describes a text format for the representation of data. The YAML library in Ruby (and many other languages also have YAML libraries; YAML is not Ruby-specific) has facilities for serializing data into YAML strings and turning YAML strings into Ruby objects.
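A minimal round trip, independent of the Hash override, showing both directions. The variable names are illustrative:

```ruby
require "yaml"

states = { "NJ" => "New Jersey", "NY" => "New York" }

yaml_text = states.to_yaml          # serialize the hash to a YAML string
restored  = YAML.load(yaml_text)    # parse the YAML string back into a hash

puts yaml_text
p restored == states   # true
```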

In order to intercept hash operations and save the hash in YAML format, we first

need to require the YAML extension B. Then, inside the Hash class, we create an alias

for the []= method C. Inside the new definition of []=, we start by calling the old


version of []=, via the __old_set__ alias D. At the end of the method, we return the

assigned value (which is the normal behavior of the original []= method). In between

lies the writing to file of the YAML serialization of the hash E.

To try the program, save it to a file and add the following sample code at the

bottom:

states = {}

states["NJ"] = "New Jersey"

states["NY"] = "New Yorrk"

puts File.read("hash_contents")

puts

states["NY"] = "New York"

puts File.read("hash_contents")

If you run the file, you’ll see two YAML-ized hashes printed out. The first has the

wrong spelling of York; the second has the corrected spelling. What you’re seeing are

two YAML serializations. The pass-through alteration of Hash#[]= has allowed for the

recording of the hash in various states, as serialized by YAML.

It’s possible to write methods that combine the additive and pass-through philosophies. Some examples from ActiveSupport will demonstrate how to do this.

ADDITIVE/PASS-THROUGH HYBRIDS

An additive/pass-through hybrid is a method that has the same name as an existing core

method, calls the old version of the method (so it’s not an out-and-out replacement),

and adds something to the method’s interface. In other words, it’s an override that

offers a superset of the functionality of the original method.

The ActiveSupport library, which is part of the Rails web application development

framework and includes lots of additions to Ruby core classes, features a number of

additive/pass-through hybrid methods. A good example is the to_s method of the

Time class. Unchanged, Time#to_s provides a nice human-readable string representing the time:

>> Time.now.to_s

=> "2008-08-25 07:41:40 -0400"

ActiveSupport adds to the method so that it can take an argument indicating a specific kind of formatting. For example, you can format a Time object in a manner suitable for database insertion like this:

>> Time.now.to_s(:db)

=> "2008-08-25 07:46:25"

If you want the date represented as a number, ask for the :number format:

>> Time.now.to_s(:number)

=> "20080825074638"

The :rfc822 argument nets a time formatted in RFC822 style, the standard date format for dates in email headers. It’s similar to the Time#rfc822 method:

>> Time.now.to_s(:rfc822)

=> "Mon, 25 Aug 2008 07:46:41 -0400"


The various formats added to Time#to_s work by using strftime, which wraps the C library function of the same name and lets you format times in a large number of ways. So

there’s nothing in the modified Time#to_s that you couldn’t do yourself. The

optional argument is added for your convenience (and of course the database-friendly

:db format is of interest mainly if you’re using ActiveSupport in conjunction with an

object-relational library, such as ActiveRecord). The result is a superset of Time#to_s.

You can ignore the add-ons, and the method will work like it always did.
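To illustrate the "nothing you couldn't do yourself" point, here is a sketch that uses plain strftime, without ActiveSupport, to produce strings in the same shapes as the examples above. The format strings are chosen to reproduce those shapes and are an assumption about, not a quotation of, ActiveSupport's internals. A fixed time is used so the results are reproducible:

```ruby
# A fixed time, so the results don't depend on when this runs.
time = Time.new(2008, 8, 25, 7, 46, 25, "-04:00")

db_style     = time.strftime("%Y-%m-%d %H:%M:%S")
number_style = time.strftime("%Y%m%d%H%M%S")
rfc822_style = time.strftime("%a, %d %b %Y %H:%M:%S %z")

p db_style       # "2008-08-25 07:46:25"
p number_style   # "20080825074625"
p rfc822_style   # "Mon, 25 Aug 2008 07:46:25 -0400"
```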

The kind of superset-driven override of core methods represented by ActiveSupport runs some risks: specifically, the risk of collision. Is it likely that you’ll end up

loading two libraries that both add an optional :db argument to Time#to_s? No; it’s

unlikely—but it’s possible. To some extent, a library like ActiveSupport is protected

by its high profile: if you load it, you’re probably familiar with what it does and will

know not to override the overrides. Still, it’s remotely possible that another library you

load might clash with ActiveSupport. As always, it’s difficult or impossible to reduce

the risk of collision to zero. You need to protect yourself by familiarizing yourself with

what every library does and by testing your code sufficiently.

The last major approach to overriding core Ruby behavior we’ll look at—and the

safest way to do it—is the addition of functionality on a strictly per-object basis, using

Object#extend.

13.2.4 Per-object changes with extend

Object#extend is a kind of homecoming in terms of topic flow. We’ve wandered to

the outer reaches of modifying core classes—and extend brings us back to the central

process at the heart of all such changes: changing the behavior of an individual

object. It also brings us back to an earlier topic from this chapter: the mixing of a

module into an object’s singleton class. That’s essentially what extend does.

ADDING TO AN OBJECT’S FUNCTIONALITY WITH EXTEND

Have another look at section 13.1.3 and in particular the Person example where we

mixed the Secretive module into the singleton classes of some Person objects. As a

reminder, the technique was this (where ruby is a Person instance):

class << ruby
  include Secretive
end

Here’s how the Person example would look, using extend instead of explicitly opening up the singleton class of the ruby object. Let’s also use extend for david (instead

of the singleton method definition with def):

module Secretive
  def name
    "[not available]"
  end
end

class Person
  attr_accessor :name
end

david = Person.new
david.name = "David"
matz = Person.new
matz.name = "Matz"
ruby = Person.new
ruby.name = "Ruby"

david.extend(Secretive)                              # B
ruby.extend(Secretive)

puts "We've got one person named #{matz.name}, " +
     "one named #{david.name}, " +
     "and one named #{ruby.name}."

Most of this program is the same as the first version. The key difference is the use of

extend B, which has the effect of adding the Secretive module to the lookup paths

of the individual objects david and ruby by mixing it into their respective singleton

classes. That inclusion process happens when you extend a class object, too.
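The claim that extend mixes the module into the singleton class can be checked directly. This sketch repeats the minimum of the Person example and assumes a Ruby with a built-in singleton_class (1.9.2 or later) or the definition given earlier in the chapter:

```ruby
module Secretive
  def name
    "[not available]"
  end
end

class Person
  attr_accessor :name
end

ruby = Person.new
ruby.name = "Ruby"
matz = Person.new
matz.name = "Matz"

ruby.extend(Secretive)

# The module shows up in ruby's singleton class, and nowhere else.
p ruby.singleton_class.include?(Secretive)   # true
p matz.singleton_class.include?(Secretive)   # false
p Person.include?(Secretive)                 # false

p ruby.name   # "[not available]"
p matz.name   # "Matz"
```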

ADDING CLASS METHODS WITH EXTEND

If you write a singleton method on a class object, like so

class Car
  def self.makes
    %w{ Honda Ford Toyota Chevrolet Volvo }
  end
end

or like so

class Car
  class << self
    def makes
      %w{ Honda Ford Toyota Chevrolet Volvo }
    end
  end
end

or with any of the other notational variants available, you’re adding an instance

method to the singleton class of the class object. It follows that you can achieve this, in

addition to the other ways, by using extend:

module Makers
  def makes
    %w{ Honda Ford Toyota Chevrolet Volvo }
  end
end

class Car
  extend Makers
end

If it’s more appropriate in a given situation, you can extend the class object after it

already exists:

Car.extend(Makers)
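A brief sketch confirming the effect: after extend, makes behaves as a class method of Car and is not available on Car instances. The respond_to? and singleton_class checks are added here for illustration:

```ruby
module Makers
  def makes
    %w{ Honda Ford Toyota Chevrolet Volvo }
  end
end

class Car
end

Car.extend(Makers)

p Car.makes                                # ["Honda", "Ford", "Toyota", "Chevrolet", "Volvo"]
p Car.new.respond_to?(:makes)              # false
p Car.singleton_class.include?(Makers)     # true
```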
