 www.icosaedro.it 

 PHPLint Tutorial

This (not so) brief tutorial explains how to make your programs PHPLint-compliant. I'm firmly convinced that good PHP source must be so simple to read that even a dumb program can understand it. The converse also holds: once a program has passed the validation of PHPLint (the dumb program in question), your source is good and ready to work nicely. I hope these notes will help you grasp the "philosophy" behind PHPLint and its motivations.

It might be useful to test the examples presented here using the on-line version of PHPLint.

Index

General structure of the program
The Type will be with you, always
Classes
Bad code
Generating the documentation

General structure of the program

Declare the required extension modules. There are several extension modules that may or may not be available in your installation of the PHP interpreter. You MUST specify ALL the extensions actually used by your program using the special meta-code statement require_module, as in this example:

<?php
/*.
    require_module 'standard';
    require_module 'pcre';
    require_module 'mysql';
.*/
?>

The standard module is required by most applications, since it exports commonly used functions like strlen() and constants like PHP_OS. The page Modules lists all the available modules and lets you search for an item. Note the special comment /*. .*/ that marks a block of PHPLint meta-code: there is exactly one period after the first asterisk, and another period before the second asterisk. Such meta-code is ignored by the PHP interpreter, since it appears inside a comment.

PHPLint reports an error if it encounters an item (constant, variable, function or class) that cannot be found in the specified modules. PHPLint also complains if a module is required but not actually used. Moreover, the required modules can also be inherited from packages imported via require_once. To sum up, the documentation PHPLint generates reports which modules and packages your package actually needs. This list is really useful when the package has to be deployed on the target system, typically a web server.

Use require_once to include other packages. The alternatives require, include and include_once are not reliable ways to include code. For example, include is commonly used to include snippets of HTML code, as is often required to build the header and the footer of a web page, but PHPLint does not recursively parse these files.

Declarations first. Before the code is actually executed, the PHP interpreter scans all of it looking for functions, classes and methods, so these can appear in any order. By contrast, PHPLint is a single-pass parser, so it needs to read a definition before its usage. PHPLint raises an error if a function or a class is used before being defined. Take care to sort your program in bottom-up order: low-level functions, classes and methods first, high-level ones next.
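A minimal sketch of the bottom-up order described above. The function names normalize_name() and greeting() are hypothetical and serve only to illustrate the ordering; the meta-code follows the conventions presented later in this tutorial.

```php
<?php
/*. require_module 'standard'; .*/

# Low-level helper first: it is defined before anything uses it.
/*. string .*/ function normalize_name(/*. string .*/ $name)
{
    return ucfirst(strtolower(trim($name)));
}

# Higher-level function next: it calls only what is already defined.
/*. string .*/ function greeting(/*. string .*/ $name)
{
    return "Hello, " . normalize_name($name) . "!";
}

# Top-level code last.
echo greeting("  gERY "), "\n";
```

Swapping the two functions would still run under the PHP interpreter, but a single-pass reader would meet the call to normalize_name() before its definition.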

The reference manual describes a feature of PHPLint meta-code named "prototype declarations" or "forward declarations"; see the chapter Recursive declarations. Prototypes can also be used to relax the strict bottom-up order of the declarations. This should be avoided, however: restrict prototypes to those cases in which the intrinsically recursive nature of the declarations requires them (example: function A calls B, and function B calls A).

The Type will be with you, always

In PHP a variable can hold any type of value. So, for example, you can assign to a variable $foo a number, then later you can assign to the same variable a string. A given function may return several types of values depending on the arguments that are passed. Programs that use this "polymorphic" behavior are difficult to read and difficult to understand, and often they require the interpreter to do some expensive conversion at run-time with performance penalties.

In PHPLint things go in a completely different way. Every variable, every function argument, and every value returned by a function must have a well-defined type. PHPLint does not require the type of everything to be explicitly declared, since in most cases the type can be guessed from the code itself. For example, assigning 123 to a variable clearly makes that variable of the type int, and returning a string from a function clearly makes this function of the type string. Nevertheless, there are cases where a type must be explicitly indicated, as we will see in the examples below.

Always initialize all the variables. Some PHP programmers rely on the fact that an uninitialized variable behaves like the number zero or the empty string, depending on the context where the variable appears. That's not good programming style. Every variable must have a value (and therefore a type) before being used.
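As an illustration of this rule, here is a small sketch in which the counter receives an explicit initial value before the loop. The function count_long_words() is a hypothetical example, not part of PHPLint or of the PHP library.

```php
<?php
/*. require_module 'standard'; .*/

/*. int .*/ function count_long_words(/*. array[int]string .*/ $words, /*. int .*/ $min_len)
{
    $count = 0;   # explicit initial value: $count is an int from here on
    foreach( $words as $w ){
        if( strlen($w) >= $min_len )
            $count++;
    }
    return $count;
}

echo count_long_words(array("a", "tutorial", "lint"), 4), "\n";
```

Without the line $count = 0; the function would return NULL for an empty list and trigger a notice on the first increment otherwise.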

The type of a variable cannot change. If a variable was determined to belong to some type, it must be used according to its type. You cannot assign a string to a variable initialized by a number:

$i = 123;
$i = "hello";  # <== ERROR

The superglobal arrays $_GET, $_POST and $_REQUEST are array[string]mixed. In PHPLint every array has a type for the index and a type for its elements. The index can be int or string, while the elements must all be of the same type. The superglobal arrays $_GET & Co. all have an index of the type string, while their elements are mixed. Typically these elements are strings, but in some cases they can also be arrays of strings, or arrays of arrays of strings, and so on. Your program must ensure that the values gathered from these arrays are of the expected type. To ensure that, a value type-cast is needed at run-time:

# Example: acquiring the login mask:
$name = (string) $_POST["name"];
$pass = (string) $_POST["pass"];


# Example: acquiring optional URL parameter:
if( isset($_GET["chapter"]) )
    $chapter = (int) $_GET["chapter"];
else
    $chapter = 0; # intro

By the way, these value type-casts applied to a mixed value make PHPLint aware of the actual type expected by the program. So, $name and $pass are strings, while $chapter is an integer.
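The isset-then-cast pattern above can be wrapped in a small helper so that it is written only once. This is a sketch under assumptions: int_param() is a hypothetical function, and the $request array merely simulates $_GET so that the snippet can be run from the command line.

```php
<?php
# Hypothetical helper: reads an optional integer parameter from an
# array shaped like $_GET, returning $default when the key is missing.
/*. int .*/ function int_param(/*. array[string]mixed .*/ $src, /*. string .*/ $key, /*. int .*/ $default)
{
    if( isset($src[$key]) )
        return (int) $src[$key];   # the value type-cast fixes the type
    else
        return $default;
}

# Simulated request data; in a real page this would be $_GET.
$request = array("chapter" => "12");
echo int_param($request, "chapter", 0), "\n";
echo int_param($request, "page", 1), "\n";
```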

Declare the type of each function argument. PHPLint implements a strongly typed language where every expression must have a well-defined type. If an argument of a function does not have a type, PHPLint raises a warning and sets the type of that argument to the generic mixed type. In practice, nothing can be done with such a type, so be prepared to see an error every time that variable is used. The type of an argument can be declared using specific PHPLint meta-code. For example, a function like this one

<?php

function get_param($name, $max_len=20)
{
    if( ! isset( $_REQUEST[$name] ) )
        return NULL;
    $s = $_REQUEST[$name];
    if( strlen($s) > $max_len )
        $s = substr($s, 0, $max_len);  # keep the first $max_len characters
    return $s;
}

?>

can be converted to the PHPLint-compliant version that follows:

<?php

/*. require_module 'standard'; .*/

/*. string .*/ function get_param(/*. string .*/ $name, $max_len=20)
{
    if( ! isset( $_REQUEST[$name] ) )
        return NULL;
    $s = (string) $_REQUEST[$name];
    if( strlen($s) > $max_len )
        $s = substr($s, 0, $max_len);  # keep the first $max_len characters
    return $s;
}

?>

Note that we also declared the return type of the function. Note too that the argument $max_len does not require a type, since its default value already gives PHPLint the correct answer: int.

NULL should always have a formal type-cast. The null type is for PHPLint's internal use only. It has only one value: NULL. The same value can be held by variables of type string, array, resource and object, so PHPLint needs a way to understand to which of these types the NULL constant actually belongs. In the example above PHPLint guesses that, since the returned value must be a string, the NULL value that appears inside the return statement must be a string. As a general rule, you should not rely on these guesses, and you should provide an explicit formal type-cast:

return /*. (string) .*/ NULL;

Note that, apart from the /*. .*/ symbols, this formal type-cast is similar to a PHP value type-cast, where the type name is enclosed in round parentheses.

Use void for functions that do not return a value. PHPLint always tries to guess the returned type from the return EXPR; statement: the type resulting from the evaluation of EXPR is the type of the function. Functions containing only return; are void. As a general rule, it is better to always declare the returned type explicitly, since this makes the interface of the function more readable to the programmer.
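A short sketch contrasting the two cases. It assumes that void is written in the same position as the other return types shown in this tutorial; print_heading() and heading_width() are hypothetical names.

```php
<?php
/*. require_module 'standard'; .*/

# No value returned: declared void, and the only return is a bare one.
/*. void .*/ function print_heading(/*. string .*/ $title)
{
    echo "== ", $title, " ==\n";
    return;
}

# A value is returned: the declared type matches the type of EXPR.
/*. int .*/ function heading_width(/*. string .*/ $title)
{
    return strlen($title) + 6;   # 3 characters of decoration on each side
}

print_heading("Intro");
echo heading_width("Intro"), "\n";
```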

Use /*. args .*/ for functions accepting a variable number of arguments. Examples:

function f(/*. args .*/){}
function g($a, $b /*. , args .*/){}

The first function can be called with any number of arguments, while the latter requires at least two arguments. Note the presence of the comma inside meta-code of the second function.
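For illustration, a function declared with args can read its actual arguments through the standard PHP function func_get_args(). The function sum_all() below is a hypothetical example, and the (int) cast is an assumption about how the untyped arguments would best be presented to PHPLint.

```php
<?php
/*. require_module 'standard'; .*/

/*. int .*/ function sum_all(/*. args .*/)
{
    $total = 0;
    foreach( func_get_args() as $x )
        $total += (int) $x;   # each extra argument is cast to the expected type
    return $total;
}

echo sum_all(), " ", sum_all(1, 2, 3), "\n";
```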

Classes

All the properties of a class MUST be declared. Moreover, assign to them a type and/or an initial value. As you might guess at this point, providing an initial value lets PHPLint determine the type. Examples:


PHP 4
class Test {
    /*. public .*/ var $num = 123;
    /*. public .*/ var $arr = array(1, 2, 3);
    /*. private .*/ var /*. array[int]string .*/ $arr2;

    function Test($first = "")
    { $this->arr2 = array($first); }
}

PHP 5
class Test {
    public $num = 123;
    public $arr = array(1, 2, 3);
    private /*. array[int]string .*/ $arr2;

    function __construct($first = "")
    { $this->arr2 = array($first); }
}

Note that the array $arr2 lacks an initial value, so an explicit type declaration is required. Remember that in this case the PHP interpreter assigns NULL as the initial value.

Properties cannot be added dynamically at run-time to an object. If you need to store a variable number of data inside an object, use a property of the type array.
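The following sketch shows one way to store a variable number of named values in a declared property of type array instead of creating properties on the fly. The class Settings and its methods are hypothetical, and the placement of the formal type-cast on the empty array is an assumption based on the rules given in this tutorial.

```php
<?php
class Settings {
    # One declared property holds all the named values.
    private /*. array[string]string .*/ $values;

    function __construct()
    {
        $this->values = /*. (array[string]string) .*/ array();
    }

    /*. void .*/ function set(/*. string .*/ $key, /*. string .*/ $value)
    {
        $this->values[$key] = $value;
    }

    /*. string .*/ function get(/*. string .*/ $key, /*. string .*/ $default)
    {
        if( isset($this->values[$key]) )
            return $this->values[$key];
        else
            return $default;
    }
}

$s = new Settings();
$s->set("lang", "en");          # instead of: $s->lang = "en";
echo $s->get("lang", "?"), " ", $s->get("theme", "default"), "\n";
```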

Bad code

Constants should be... constant! PHPLint expects the expression giving the value of a constant to be statically determinable. In any other case a variable is more appropriate. Nevertheless, some programmers take advantage of the fact that constants "live" in the global namespace, so that their value can be obtained anywhere simply by writing their name:

# WRONG CODE:
define("MY_IMGS", $_SERVER['DOCUMENT_ROOT'] . "/imgs");
if ( PHP_OS == 'WINNT' )
    define("ROOT", "C:\\");
else
    define("ROOT", "/");

function f()
{
    echo "MY_IMGS=", MY_IMGS, " ROOT=", ROOT;
}

You should try submitting the code above to PHPLint: it will complain that MY_IMGS cannot be statically evaluated, and that ROOT is redefined. Since these values are determined at run-time, you should use two variables instead:

# Right code:
$my_imgs = $_SERVER['DOCUMENT_ROOT'] . "/imgs";
if ( PHP_OS === 'WINNT' )
    $root = "C:\\";
else
    $root = "/";

function f()
{
    echo "my_imgs=", $GLOBALS['my_imgs'], " root=", $GLOBALS['root'];
}

Write appropriate boolean expressions. Statements like if(EXPR), while(EXPR) and do{}while(EXPR) all require a proper boolean expression. For PHPLint the empty string, the empty array and the 0 value are not equivalent to FALSE. Some functions of the standard library that normally return a resource may return FALSE to indicate an error: these special return values must be checked with the === or the !== operators. Example:


WRONG CODE (the ! operator cannot be applied to a value of the type resource):

if( ! $f = fopen("myFile.txt", "r") ){
    die("error opening the file");
}

Right code:

if( ($f = fopen("myFile.txt", "r")) === FALSE ){
    die("error opening the file");
}

or even better:

$f = fopen("myFile.txt", "r");
if( $f === FALSE ){
    die("error opening the file");
}

Functions must always return only one type of value. Don't write functions that "return the result on success or FALSE on failure", because mixing different types prevents PHPLint from doing its job and makes the code harder to read and to debug. Here is a list of possible alternatives:
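Two such alternatives can be sketched as follows; both are assumptions chosen for illustration rather than an official PHPLint recommendation. The first returns a NULL carrying the declared type, in line with the formal type-cast described earlier; the second throws an exception. The functions first_line() and parse_positive() are hypothetical, and whether PHPLint asks for additional meta-code about thrown exceptions has not been checked here.

```php
<?php
/*. require_module 'standard'; .*/

# Alternative 1: return NULL, typed as a string, instead of FALSE.
/*. string .*/ function first_line(/*. string .*/ $text)
{
    if( strlen($text) === 0 )
        return /*. (string) .*/ NULL;
    $lines = explode("\n", $text);
    return $lines[0];
}

# Alternative 2 (PHP 5): signal the failure with an exception,
# so that the function returns an int or does not return at all.
/*. int .*/ function parse_positive(/*. string .*/ $s)
{
    $n = (int) $s;
    if( $n <= 0 )
        throw new Exception("not a positive number: $s");
    return $n;
}

$line = first_line("alpha\nbeta");
if( $line !== NULL )
    echo $line, "\n";
echo parse_positive("7"), "\n";
```

In both cases the caller tests for the failure with === or with a try/catch block, and the declared return type stays unique.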

Do not mix elements of different types in arrays. For example, this table mixes strings, numbers and boolean values:

# WRONG:
$people = array(
#   Name    Age   Married
    "Gery",  34,   FALSE,
    "Sara",  23,   TRUE,
    "John",  56,   TRUE);

echo "Married persons younger than 30: ";
for($i = 0; $i < count($people); $i += 3)
    if( $people[$i+2] and $people[$i+1] < 30 )
        echo $people[$i], " ";

PHPLint cannot effectively parse such code, and humans cannot understand it very well either. The solution is to introduce a class Person where all the data about a person are stored. The resulting code, which can be validated by PHPLint, might look like this:

# Right:
class Person {
    public /*. string .*/ $name;
    public /*. int    .*/ $age;
    public /*. bool   .*/ $married;

    function __construct(/*. string .*/ $name,
        /*. int .*/ $age, /*. bool .*/ $married)
    {
        $this->name = $name;
        $this->age  = $age;
        $this->married = $married;
    }
}

$people = array(
    new Person("Gery",  34,   FALSE),
    new Person("Sara",  23,   TRUE),
    new Person("John",  56,   TRUE)
);

echo "Married persons younger than 30: ";
for($i = 0; $i < count($people); $i++)
    if( $people[$i]->married and $people[$i]->age < 30 )
        echo $people[$i]->name, " ";

OK, I agree: this second version of the same program is longer, but the first one reminds me of the old days of BASIC, when arrays were the only data structure available. Moreover, while trying the first example for this document, I made a mistake with the index offset and the program did not work properly; the second version, instead, worked perfectly on the first run.

Proper use of ini_get(). Sometimes programs need to check at run-time some parameter of their configuration file php.ini. All the parameters declared there are available through the function ini_get($param), where $param is the name of the parameter. The value returned by this function is always a string or the NULL value. For those parameters that are simple flags, the value returned is the empty string "" or "0" for FALSE/No/Off, and "1" for TRUE/Yes/On. The other parameters are returned as strings too, even when they are actually numbers. The right way to handle this in PHPLint is shown in the following examples, which may be useful in actual applications:


if( ini_get("magic_quotes_gpc") === "1"
or  ini_get("magic_quotes_runtime") === "1")
    exit("ERROR: please disable magic quotes in php.ini");

if( ini_get("file_uploads") !== "1" )
    exit("ERROR: please enable file upload in php.ini");


/*. int .*/ function return_bytes(/*. string .*/ $s)
/*.
    DOC Converts a numeric value from the php.ini, possibly
    containing some scale factors as K, M and G.
    Example taken from the PHP manual. .*/
{
    $v = (int) $s;
    $last = strtolower($s[strlen($s)-1]);
    switch($last) {
        // The 'G' modifier is available since PHP 5.1.0
        case 'g': $v *= 1024;  /*. missing_break; .*/
        case 'm': $v *= 1024;  /*. missing_break; .*/
        case 'k': $v *= 1024;  /*. missing_break; .*/
        /*. missing_default: .*/
    }
    return $v;
}

$upload_max_filesize =
    return_bytes( trim( ini_get("upload_max_filesize" ) ) );
$post_max_size =
    return_bytes( trim( ini_get("post_max_size" ) ) );
$max_upload = min($upload_max_filesize, $post_max_size);
echo "Max uploadable file size is $max_upload bytes.";

Do not use each() and list() to assign a list of variables. PHP allows the special syntax list($x,$y)=EXPR; where EXPR is an expression generating an array, typically the value returned from a function such as each(). Never use this syntax, because PHPLint cannot determine the types of $x and $y. Rather, assign the result to an array, then use its elements.


WRONG CODE:

$a = array(1, 2, 3);

reset($a);
while( list($k, $v) = each($a) ){
    echo $k, $v;
}

Right code:

$a = array(1, 2, 3);

foreach( $a as $k => $v ){
    echo $k, $v;
}

For example, this function may be useful to measure the elapsed time precisely:

function elapsed($a)
{
    $b = microtime();
    list($b_dec, $b_sec) = explode(" ", $b);
    list($a_dec, $a_sec) = explode(" ", $a);
    return ((float)$b_sec - (float)$a_sec)
        + ((float)$b_dec - (float)$a_dec);
}

$start = (string) microtime();
/**** here do something time-consuming ****/
$e = elapsed($start);
if( $e > 1.0 )  echo "WARNING: elapsed time $e s";

Note the presence of two list() constructs. That code can be easily converted to the following PHPLint-compliant code, where the result of the explode() function is assigned to two arrays; some meta-code was also added:

/*.float.*/ function elapsed(/*.string.*/ $start)
{
    $a = explode(" ", $start);
    $b = explode(" ", (string) microtime());
    return ((float)$b[1] - (float)$a[1])
        + ((float)$b[0] - (float)$a[0]);
}

String comparisons should be made using strcmp(). Never use the weak comparison operators < <= == != >= > with strings, because they are unreliable: PHP may compare strings that look like numbers by their numeric value. Apply this simple conversion rule:

$a OP $b     ==>    strcmp($a, $b) OP 0

where OP is the comparison operator. Use === and !== for strict equality/inequality.
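A small sketch of why the rule matters, using two strings that differ as text but that PHP treats as the same number under the weak == operator. The variable names are chosen only for this illustration.

```php
<?php
/*. require_module 'standard'; .*/

$a = "1e3";
$b = "1000";

# Weak comparison: both operands look like numbers, so PHP
# compares 1000 with 1000 and finds them equal.
$weak_equal = ($a == $b);

# Rewritten with the rule:  $a == $b  ==>  strcmp($a, $b) == 0
$strcmp_equal = (strcmp($a, $b) == 0);

# Strict equality compares the strings as they are.
$strict_equal = ($a === $b);

echo "weak:   ", ($weak_equal   ? "equal" : "different"), "\n";
echo "strcmp: ", ($strcmp_equal ? "equal" : "different"), "\n";
echo "strict: ", ($strict_equal ? "equal" : "different"), "\n";
```

Only the weak comparison reports the two strings as equal; strcmp() and === agree that they differ.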

die() is a statement, not a function! This syntax is invalid:

$f = fopen(...) or die(...);

because die() does not return a boolean value (actually, it does not return anything at all). Use the longer form shown above. The same holds for exit(), which is a synonym of die().

Do not use "variable name" classes, for example

$obj = new $class();

because $class might be any string, without any relation to the known classes; this source is difficult for a human reader to understand, and impossible for PHPLint to check. Consider using an abstract class instead (see the examples inside the manual, in the chapters devoted to PHP 4 classes and PHP 5 classes). PHP 5 also introduced interfaces, intended precisely to address these problems elegantly. Adding these "glue-classes" makes the code more readable, and PHPLint helps to keep the complexity under control.

Returning to the example above, if $obj has to be an instance of some class determined dynamically at run-time, these classes are certainly related in some way, i.e. they exhibit the same interface. This interface (i.e. a common set of constants, properties and methods) is what the code that follows will use. Two classes that share the same interface must have a common ancestor, which may be an abstract class or an interface. The example below illustrates this scheme:

    interface Ancestor {
        function doThis();
        function doThat();
    }

    class ConcreteClass1 implements Ancestor {
        public function doThis() { /* ...implementation... */ }
        public function doThat() { /* ...implementation... */ }
    }

    class ConcreteClass2 implements Ancestor {
        public function doThis() { /* ...implementation... */ }
        public function doThat() { /* ...implementation... */ }
    }

    # Declare the variable $obj to be a generic Ancestor.
    # This says to PHPLint that $obj is an object that
    # implements "Ancestor":
    $obj = /*. (Ancestor) .*/ NULL;

    if( we_need_ConcreteClass1 )
        $obj = new ConcreteClass1();
    else /* we need ConcreteClass2 instead */
        $obj = new ConcreteClass2();

    # Now we can use $obj according to the interface as specified
    # for Ancestor, whichever its actual implementation may be:
    $obj->doThis();
    $obj->doThat();

    # The same strategy can be used also inside the functions:
    function doThisAndThat(/*. Ancestor .*/ $obj)
    {
        $obj->doThis();
        $obj->doThat();
    }

    doThisAndThat($obj);

The advantage of using abstract classes and interfaces is that the PHP interpreter, the PHPLint validator and humans reading the source can all understand the meaning of the source and detect possible violations of the "contract" rules of the extended and implemented classes.

Generating the documentation

PHPLint has its own documentation system (PHPLint Documentator), but it also supports the phpDocumentor system (PHPLint support for phpDocumentor). The following example compares the two systems:


<?php
/**
 * Testing phpDocumentor and PHPLint Documentator
 *
 * @package TstPhpDoc
 * @author Umberto Salsi <phplint@icosaedro.it>
 */

/*. require_module 'standard'; .*/


/**
 * Returns the number nearest to zero
 *
 * Here we use the phpDocumentor system.
 * PHPLint gathers the short description above, this long
 * description, and the following declarations that give
 * the signature of the function.
 *
 * @param int $a   The first number...
 * @param int $b   ...and this is the second
 * @return int
 */
function nearest_zero_1($a, $b)
{
    if( abs($a) <= abs($b) )
        return $a;
    else
        return $b;
}


/*. int .*/ function nearest_zero_2(/*. int .*/ $a, /*. int .*/ $b)
/*.
    DOC Returns the number nearest to zero
    
    Using the PHPLint Documentator, the signature of the
    function is declared using the PHPLint meta-code.
.*/
{
    if( abs($a) <= abs($b) )
        return $a;
    else
        return $b;
}

?>

PHPLint can guess the type of any expression and the structure of an array from their usage, with two exceptions: the empty array() constructor and the NULL value. In these cases neither the PHP language nor phpDocumentor helps, and a PHPLint formal type-cast is required. In the following example, $names is an initially empty list of strings with integer index, and $last_exception is an object of the class Exception, initially not instantiated:

/**
 * List of names
 */
$names = /*. (array[int]string) .*/ array();


$last_exception = /*. (Exception) .*/ NULL;
/*.
    DOC  Last <@item Exception> occurred

    Note that $last_exception can hold any object instantiated from the
    Exception class or from any of its derived sub-classes, for example
    <@item ErrorException>.
.*/
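A short usage sketch of the two variables once they have been given a type: elements are appended to the empty list, and an object of a derived class is stored in the variable declared as Exception. The values used here are invented for illustration.

```php
<?php
/*. require_module 'standard'; .*/

# The formal type-cast tells PHPLint what the empty array will contain.
$names = /*. (array[int]string) .*/ array();
$names[] = "Gery";
$names[] = "Sara";
echo count($names), " names, the first is ", $names[0], "\n";

# NULL typed as Exception: any Exception or sub-class can be stored later.
$last_exception = /*. (Exception) .*/ NULL;
try {
    throw new ErrorException("disk full");
}
catch( Exception $e ){
    $last_exception = $e;
}
if( $last_exception !== NULL )
    echo "last error: ", $last_exception->getMessage(), "\n";
```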

Umberto Salsi
