How To Compose Methods: The Extract Method Refactoring
Posted on 28/6/07 by Tim Koschützki
When you have a code fragment that can be grouped together, turn the fragment into a method whose name explains the purpose of the method. Eliminate all conflicts with local variables as you go.
Motivation
Extract Method is one of the refactorings that you will do most often. It just happens too often that as a deadline comes near people tend to take their programming not as seriously as they should. The result? Godclasses and god methods, that try to do too much. If you manage to break classes down into smaller classes and methods into smaller methods you will be better off for several reasons: your code will read like a series of comments; you give other methods a better chance to use your code; you will make your code a lot more testable. Oh and ou will feel a lot more comfortable about your code.
Finally, you will also make your code a lot easier accessible for fellow coders and improve your overall reputation within your company. The problem with short methods is the naming. You really have to pay attention to your naming. With many short methods you will end up with really long names that may be longer than the code in their body. That's perfect, though - don't worry about this. If your method names speak for themselves you are on your way to success.
"So, how long should a method be?" I hear you asking. To me, length is not an issue. They key is the closeness of the method name and the method body. If extracting improves clarity then do it. Even do it, when the new name of the method is longer than its new body.
Mechanics
- Create a new method and name it after the new intention of it (name it by what it does and not by how it does it).
- Copy the code to extract into the new method.
- Check the extracted code for any variables that are local in scope to the source method.
- If any temporary variables are only used within your target method, make them local variables there.
- Make the local-variable checks: See if Inline Temp or Replace Temp With Query
make sense. If one variable is modified, you can most often treat the extracted code as a query and assign the result to the variable concerned. If there are several such variables, don't extract the method and try to encapsulate the temporary variables first. - Pass local-scope variables from the source method as parameters into the target method.
- Replace the extracted code in the source method with a call to the target method.
- Test.
Examples
No Local variables
In its simplest case, extract method can be done on the fly and is trivilially easy:
$output = < <<HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://gmpg.org/xfn/11">
<title> PHP Coding Practices - Become an expert PHP Programmer</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="generator" content="WordPress 2.1.1" /> <!-- leave this for stats please -->
<meta name="verify-v1" content="beuH5Tc0J6zi6YCVt6RnxwIeqR/Zq1WhRwgi7I5hNZo=" />
<meta name="robots" content="index,follow" />
<meta name="language" content="en" />
<meta name="author" content="Tim Koschuetzki" />
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon" />
<link rel="stylesheet" href="http://php-coding-practices.com/wp-content/themes/digg-3-col/style.css" type="text/css" media="screen" />
<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://php-coding-practices.com/feed/" />
<link rel="alternate" type="text/xml" title="RSS .92" href="http://php-coding-practices.com/feed/rss/" />
<link rel="alternate" type="application/atom+xml" title="Atom 0.3" href="http://php-coding-practices.com/feed/atom/" />
<link rel="pingback" href="http://php-coding-practices.com/xmlrpc.php" />
<link rel='archives' title='June 2007' href='http://php-coding-practices.com/2007/06/' />
<link rel='archives' title='May 2007' href='http://php-coding-practices.com/2007/05/' />
<link rel='archives' title='April 2007' href='http://php-coding-practices.com/2007/04/' />
<link rel='archives' title='March 2007' href='http://php-coding-practices.com/2007/03/' />
<!-- META Tags added by Add-Meta-Tags WordPress plugin. Get it at: http://www.g-loaded.eu/ -->
<meta name="description" content="We provide easy-to-read, fun and practical articles about php best coding practices. Implement the tips given and you will become an expert php programmer." />
<meta name="keywords" content="php coding practices, best practices, php webdevelopment, coding guidelines, coding standards, coding principles, security, test driven development, unit tests, coding practice principle secure" />
HTML;
echo $output;
}
This function prints the header of http://php-coding-practices.com. We could as easily define it as:
$output = buildDoctype();
$output .= buildTitle();
$output .= buildRelations();
$output .= buildMetaTags();
echo $output;
}
function buildDoctype() {
$output = < <<HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://gmpg.org/xfn/11">
HTML;
return $output;
}
function buildTitle() {
$output = < <<HTML
<title> PHP Coding Practices - Become an expert PHP Programmer
HTML;
return $output;
}
function buildRelations() {
$output = < <<HTML
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon" />
<link rel="stylesheet" href="http://php-coding-practices.com/wp-content/themes/digg-3-col/style.css" type="text/css" media="screen" />
<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://php-coding-practices.com/feed/" />
<link rel="alternate" type="text/xml" title="RSS .92" href="http://php-coding-practices.com/feed/rss/" />
<link rel="alternate" type="application/atom+xml" title="Atom 0.3" href="http://php-coding-practices.com/feed/atom/" />
<link rel="pingback" href="http://php-coding-practices.com/xmlrpc.php" />
<link rel='archives' title='June 2007' href='http://php-coding-practices.com/2007/06/' />
<link rel='archives' title='May 2007' href='http://php-coding-practices.com/2007/05/' />
<link rel='archives' title='April 2007' href='http://php-coding-practices.com/2007/04/' />
<link rel='archives' title='March 2007' href='http://php-coding-practices.com/2007/03/' />
HTML;
return $output;
}
function buildMetaTags() {
$output = < <<HTML
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="generator" content="WordPress 2.1.1" /> <!-- leave this for stats please -->
<meta name="verify-v1" content="beuH5Tc0J6zi6YCVt6RnxwIeqR/Zq1WhRwgi7I5hNZo=" />
<meta name="robots" content="index,follow" />
<meta name="language" content="en" />
<meta name="author" content="Tim Koschuetzki" />
<meta name="description" content="We provide easy-to-read, fun and practical articles about php best coding practices. Implement the tips given and you will become an expert php programmer." />
<meta name="keywords" content="php coding practices, best practices, php webdevelopment, coding guidelines, coding standards, coding principles, security, test driven development, unit tests, coding practice principle secure" />
HTML;
return $output;
}
Looks much cleaner already, no? Now if we wanted to use variables to build the site title or to load a css file based on user input, then that would be a lot easier than before.
Using Local Variables
The problem with local variables are parameters passed into the original method and temporaries built in the original method. Local variables are only in scope within the source method and they sometimes cause problems or prevent you from doing the refactoring altogether.
The easiest case is when the locals are only read, but not changed. In this case you can just pass them in as parameters:
$result = 0;
$i = 0;
foreach($array as $value) {
$i++;
if($i % 2 == 0) {
$result += $value;
}
}
echo "Some HTML Formatting for the result: {$result}"; // $result is the local in this case
}
I can extract the printing of the result into its own method:
$result = 0;
$i = 0;
foreach($array as $value) {
$i++;
if($i % 2 == 0) {
$result += $value;
}
}
printResult($result);
}
function printResult($result) {
echo "Some HTML Formatting for the result: {$result}";
}
Reassigning A Local Variables
It's the assignment to locale variables that becomes complicated. If you see an assignment to a parameter you should immediately remove it.
For temps that are assigned to, there are two cases. The simpler cases is that in which the variable is a temp used only in the extracted code. In this case you can remove it into the extracted code.
If the variable is not used after the code is extracted, you can make the change in just the extracted code. If it is used afterward, you need to make the extracted code return the changed value of the variable. Check the following code that illustrates this:
$result = 0;
$i = 0;
foreach($array as $value) {
$i++;
if($i % 2 == 0) {
$result += $value;
}
}
printResult($result);
}
function printResult($result) {
echo "Some HTML Formatting for the result: {$result}";
}
Now I extract the calculation:
$result = calculateSumOfAnyOtherArrayValue($array);
printResult($result);
}
function calculateSumOfAnyOtherArrayValue($array) {
$result = 0;
$i = 0;
foreach($array as $value) {
$i++;
if($i % 2 == 0) {
$result += $value;
}
}
return $result;
}
function printResult($result) {
echo "Some HTML Formatting for the result: {$result}";
}
The $result variable is used in both places - so I need to return it from calculateSumOfAnyOtherArrayValue(). Now we can rewrite our printSumOfAnyOtherArrayValue() method to:
printResult(calculateSumOfAnyOtherArrayValue($array));
}
If something more involved should happen to the $result variable, like an initial value depending on some class attribute, we would pass it to calculateSumOfAnyOtherArrayValue() as a parameter:
$result = $this->start;
$result = calculateSumOfAnyOtherArrayValue($array,$result);
printResult($result);
}
function calculateSumOfAnyOtherArrayValue($array, $initialValue) {
$result = $initialValue;
$i = 0;
foreach($array as $value) {
$i++;
if($i % 2 == 0) {
$result += $value;
}
}
return $result;
}
After some testing, we can refactor this and make it even better:
$result = calculateSumOfAnyOtherArrayValue($array,$this->start);
printResult($result);
}
Now this looks neat no? Now the calculation of the actual value can be very easily changed. We could just call another calculation method, or strategy, if you want to speak in terms of Design Patterns.
What Happens When More Than One Variable Needs To Be Returned?
Here you have several options. The way I prefer is to have a method for each return value. Usually you can arrange things for that and you will be fine. However, if you somehow can't do it and things look awkward, return an array of return values instead and live with it. Document it as some *nasty* stuff, though, and refactor it again with the next opportunity.