debuggable

 
Contact Us
 

How To Compose Methods: The Extract Method Refactoring

Posted on 28/6/07 by Tim Koschützki

When you have a code fragment that can be grouped together, turn the fragment into a method whose name explains the purpose of the method. Eliminate all conflicts with local variables as you go.

Motivation

Extract Method is one of the refactorings that you will do most often. It just happens too often that as a deadline comes near people tend to take their programming not as seriously as they should. The result? Godclasses and god methods, that try to do too much. If you manage to break classes down into smaller classes and methods into smaller methods you will be better off for several reasons: your code will read like a series of comments; you give other methods a better chance to use your code; you will make your code a lot more testable. Oh and ou will feel a lot more comfortable about your code.

Finally, you will also make your code a lot easier accessible for fellow coders and improve your overall reputation within your company. The problem with short methods is the naming. You really have to pay attention to your naming. With many short methods you will end up with really long names that may be longer than the code in their body. That's perfect, though - don't worry about this. If your method names speak for themselves you are on your way to success.

"So, how long should a method be?" I hear you asking. To me, length is not an issue. They key is the closeness of the method name and the method body. If extracting improves clarity then do it. Even do it, when the new name of the method is longer than its new body.

Mechanics

  1. Create a new method and name it after the new intention of it (name it by what it does and not by how it does it).
  2. Copy the code to extract into the new method.
  3. Check the extracted code for any variables that are local in scope to the source method.
  4. If any temporary variables are only used within your target method, make them local variables there.
  5. Make the local-variable checks: See if Inline Temp or Replace Temp With Query
    make sense. If one variable is modified, you can most often treat the extracted code as a query and assign the result to the variable concerned. If there are several such variables, don't extract the method and try to encapsulate the temporary variables first.
  6. Pass local-scope variables from the source method as parameters into the target method.
  7. Replace the extracted code in the source method with a call to the target method.
  8. Test.

Examples

No Local variables

In its simplest case, extract method can be done on the fly and is trivilially easy:

function printHeader {
  $output = < <<HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://gmpg.org/xfn/11">

  <title> PHP Coding Practices - Become an expert PHP Programmer</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> 
  <meta name="generator" content="WordPress 2.1.1" /> <!-- leave this for stats please -->
        <meta name="verify-v1" content="beuH5Tc0J6zi6YCVt6RnxwIeqR/Zq1WhRwgi7I5hNZo=" />
  <meta name="robots" content="index,follow" />
<meta name="language" content="en" />
<meta name="author" content="Tim Koschuetzki" />

<link rel="shortcut icon" href="favicon.ico" type="image/x-icon" />
<link rel="stylesheet" href="http://php-coding-practices.com/wp-content/themes/digg-3-col/style.css" type="text/css" media="screen" />
  <link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://php-coding-practices.com/feed/" />
  <link rel="alternate" type="text/xml" title="RSS .92" href="http://php-coding-practices.com/feed/rss/" />
  <link rel="alternate" type="application/atom+xml" title="Atom 0.3" href="http://php-coding-practices.com/feed/atom/" />
  <link rel="pingback" href="http://php-coding-practices.com/xmlrpc.php" />

    <link rel='archives' title='June 2007' href='http://php-coding-practices.com/2007/06/' />
  <link rel='archives' title='May 2007' href='http://php-coding-practices.com/2007/05/' />
  <link rel='archives' title='April 2007' href='http://php-coding-practices.com/2007/04/' />

  <link rel='archives' title='March 2007' href='http://php-coding-practices.com/2007/03/' />
   
<!-- META Tags added by Add-Meta-Tags WordPress plugin. Get it at: http://www.g-loaded.eu/ -->
<meta name="description" content="We provide easy-to-read, fun and practical articles about php best coding practices. Implement the tips given and you will become an expert php programmer." />
<meta name="keywords" content="php coding practices, best practices, php webdevelopment, coding guidelines, coding standards, coding principles, security, test driven development, unit tests, coding practice principle secure" />
HTML;
  echo $output;
}

This function prints the header of http://php-coding-practices.com. We could as easily define it as:

function printHeader {
  $output = buildDoctype();
  $output .= buildTitle();
  $output .= buildRelations();
  $output .= buildMetaTags();
 
  echo $output;
}

function buildDoctype() {
  $output = < <<HTML
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head profile="http://gmpg.org/xfn/11">
HTML;
  return $output;
}

function buildTitle() {
  $output = < <<HTML
    <title> PHP Coding Practices - Become an expert PHP Programmer
HTML;
  return $output;
}

function buildRelations() {
  $output = < <<HTML
    <link rel="shortcut icon" href="favicon.ico" type="image/x-icon" />
    <link rel="stylesheet" href="http://php-coding-practices.com/wp-content/themes/digg-3-col/style.css" type="text/css" media="screen" />
    <link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://php-coding-practices.com/feed/" />
    <link rel="alternate" type="text/xml" title="RSS .92" href="http://php-coding-practices.com/feed/rss/" />
    <link rel="alternate" type="application/atom+xml" title="Atom 0.3" href="http://php-coding-practices.com/feed/atom/" />
    <link rel="pingback" href="http://php-coding-practices.com/xmlrpc.php" />
 
      <link rel='archives' title='June 2007' href='http://php-coding-practices.com/2007/06/' />
    <link rel='archives' title='May 2007' href='http://php-coding-practices.com/2007/05/' />
    <link rel='archives' title='April 2007' href='http://php-coding-practices.com/2007/04/' />
 
    <link rel='archives' title='March 2007' href='http://php-coding-practices.com/2007/03/' />
 
HTML;
  return $output;
}

function buildMetaTags() {
  $output = < <<HTML
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> 
    <meta name="generator" content="WordPress 2.1.1" /> <!-- leave this for stats please -->
          <meta name="verify-v1" content="beuH5Tc0J6zi6YCVt6RnxwIeqR/Zq1WhRwgi7I5hNZo=" />
    <meta name="robots" content="index,follow" />
    <meta name="language" content="en" />
    <meta name="author" content="Tim Koschuetzki" />
    <meta name="description" content="We provide easy-to-read, fun and practical articles about php best coding practices. Implement the tips given and you will become an expert php programmer." />
    <meta name="keywords" content="php coding practices, best practices, php webdevelopment, coding guidelines, coding standards, coding principles, security, test driven development, unit tests, coding practice principle secure" />
HTML;
  return $output;
}

Looks much cleaner already, no? Now if we wanted to use variables to build the site title or to load a css file based on user input, then that would be a lot easier than before.

Using Local Variables

The problem with local variables are parameters passed into the original method and temporaries built in the original method. Local variables are only in scope within the source method and they sometimes cause problems or prevent you from doing the refactoring altogether.

The easiest case is when the locals are only read, but not changed. In this case you can just pass them in as parameters:

function printSumOfAnyOtherArrayValue($array) {
  $result = 0;
  $i = 0;
  foreach($array as $value) {
    $i++;
    if($i % 2 == 0) {
      $result += $value;
    }
  }
 
  echo "Some HTML Formatting for the result: {$result}"; // $result is the local in this case
}

I can extract the printing of the result into its own method:

function printSumOfAnyOtherArrayValue($array) {
  $result = 0;
  $i = 0;
  foreach($array as $value) {
    $i++;
    if($i % 2 == 0) {
      $result += $value;
    }
  }
 
  printResult($result);
}

function printResult($result) {
  echo "Some HTML Formatting for the result: {$result}";
}

Reassigning A Local Variables

It's the assignment to locale variables that becomes complicated. If you see an assignment to a parameter you should immediately remove it.

For temps that are assigned to, there are two cases. The simpler cases is that in which the variable is a temp used only in the extracted code. In this case you can remove it into the extracted code.

If the variable is not used after the code is extracted, you can make the change in just the extracted code. If it is used afterward, you need to make the extracted code return the changed value of the variable. Check the following code that illustrates this:

function printSumOfAnyOtherArrayValue($array) {
  $result = 0;
  $i = 0;
  foreach($array as $value) {
    $i++;
    if($i % 2 == 0) {
      $result += $value;
    }
  }
 
  printResult($result);
}

function printResult($result) {
  echo "Some HTML Formatting for the result: {$result}";
}

Now I extract the calculation:

function printSumOfAnyOtherArrayValue($array) {
  $result = calculateSumOfAnyOtherArrayValue($array);
  printResult($result);
}
function calculateSumOfAnyOtherArrayValue($array) {
  $result = 0;
  $i = 0;
  foreach($array as $value) {
    $i++;
    if($i % 2 == 0) {
      $result += $value;
    }
  }
  return $result;
}
 
function printResult($result) {
  echo "Some HTML Formatting for the result: {$result}";
}

The $result variable is used in both places - so I need to return it from calculateSumOfAnyOtherArrayValue(). Now we can rewrite our printSumOfAnyOtherArrayValue() method to:

function printSumOfAnyOtherArrayValue($array) {
  printResult(calculateSumOfAnyOtherArrayValue($array));
}

If something more involved should happen to the $result variable, like an initial value depending on some class attribute, we would pass it to calculateSumOfAnyOtherArrayValue() as a parameter:

function printSumOfAnyOtherArrayValue($array) {
  $result = $this->start;
  $result = calculateSumOfAnyOtherArrayValue($array,$result);
  printResult($result);
}
function calculateSumOfAnyOtherArrayValue($array, $initialValue) {
  $result = $initialValue;
  $i = 0;
  foreach($array as $value) {
    $i++;
    if($i % 2 == 0) {
      $result += $value;
    }
  }
  return $result;
}

After some testing, we can refactor this and make it even better:

function printSumOfAnyOtherArrayValue($array) {
  $result = calculateSumOfAnyOtherArrayValue($array,$this->start);
  printResult($result);
}

Now this looks neat no? Now the calculation of the actual value can be very easily changed. We could just call another calculation method, or strategy, if you want to speak in terms of Design Patterns.

What Happens When More Than One Variable Needs To Be Returned?

Here you have several options. The way I prefer is to have a method for each return value. Usually you can arrange things for that and you will be fine. However, if you somehow can't do it and things look awkward, return an array of return values instead and live with it. Document it as some *nasty* stuff, though, and refactor it again with the next opportunity.

 
&nsbp;

You can skip to the end and add a comment.

[...] you ask me you should prefer Extract Method if you can, because more smaller methods are usually more readable and (re-)usable. However, there [...]

[...] If you are on PHP5 or PHP4 (using the by-reference operator), look in the calling method to see if the parameter is used again afterward and try to understand what is changed where in your code and document that. Make sure you only return one value from the method. If more than one call-by-reference parameter is assigned to and returned, try to use a data clump or Extract Method. [...]

Winer Real said on Feb 24, 2009:

I have lot of trouble forcing cote to move from php4 to php5 is hard to find this simple solution thanks.

Tarique Sani said on Feb 25, 2009:

Tim this is one of your best writings in recent past /me applauds

This post is too old. We do not allow comments here anymore in order to fight spam. If you have real feedback or questions for the post, please contact us.