Hungarian Notation

Thursday, February 25th, 2010 | Uncategorized | No Comments

I have a friend who, when he first started programming, would name his variables and functions after super heroes and cartoon characters. A function that checked the status of a flag might look like this:

goFindSuperMan(catwoman);

This made debugging his code a nightmare. Without knowing it, he had obfuscated the code completely, while still making it interesting to read.

Despite a few exceptions like this, most programmers these days use reasonably intelligent naming conventions for their functions and variables. For instance, a function used to check the status of a flag will normally look something like this:

checkFlag(flagName);

However, even with good naming conventions, one of the problems with reading other people’s code is that often there is no way to know for sure what type of data a variable stores or if it has a special status.

The problems

Variable and Function Types

This becomes a problem when a string is accidentally passed where an integer is expected. Take a function that requires an integer for the day of the week, such as 4. Nothing in a name like dayOfTheWeek stops a caller from assuming it takes a string, such as “Wednesday”.
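As a rough sketch of how this goes wrong in PHP, consider the hypothetical helper getDayName() below (the name and the 0 to 6 numbering are my own assumptions). It expects an integer, but PHP will happily accept a string and quietly return the wrong day:

```php
<?php
// Hypothetical helper: expects an integer from 0 (Sunday) to 6 (Saturday).
function getDayName($dayOfTheWeek) {
    $names = array('Sunday', 'Monday', 'Tuesday', 'Wednesday',
                   'Thursday', 'Friday', 'Saturday');
    // The cast silently turns any non-numeric string into 0.
    return $names[((int) $dayOfTheWeek) % 7];
}

echo getDayName(3), "\n";           // prints "Wednesday"
echo getDayName('Wednesday'), "\n"; // prints "Sunday": (int) 'Wednesday' is 0
```

Nothing fails here, which is exactly why this kind of bug is hard to spot.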

Variable Visibility

It is also often useful to know whether a variable has a special status, such as being private or static, since that can affect how implementations and child classes are written, or even the business logic of the application.

Object Oriented Languages

Many people argue that object oriented programming languages such as Java, C#, or PHP don’t need to follow these rules, for two reasons:

  1. The variable name makes it clear what the data type is. For instance, if an application has a class MyClass, then a variable holding an instance would be called MyClass, and an array of instances MyClasses, as in
    $MyClasses = array(new MyClass(), new MyClass());
  2. If the data type is not correctly set, the application can throw an Exception.

Although these are true, there are still three problems:

  1. There is no way to know if the variable has a special status, such as public, private, or static.
  2. Most object oriented programming languages still support primitive variable types such as int and float.
  3. Many programmers use bad or mixed conventions when developing code, particularly with languages that are picked up by non-programmers, such as PHP, ECMAScript, and ActionScript.
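Here is a small sketch of both sides of that argument, assuming PHP 7 or later, where a mismatched type declaration raises a TypeError. The function names registerObject() and setDayOfTheWeek() are made up for illustration. The object parameter is checked by the language; the undeclared primitive is not:

```php
<?php
class MyClass {}

// The class type declaration is checked by PHP itself.
function registerObject(MyClass $MyClass) {
    return get_class($MyClass);
}

// No declaration here, so nothing stops a caller from passing 'Wednesday'.
function setDayOfTheWeek($dayOfTheWeek) {
    return $dayOfTheWeek;
}

try {
    registerObject('not an object');
} catch (TypeError $e) {
    echo "Rejected: wrong type passed to registerObject()\n";
}

echo setDayOfTheWeek('Wednesday'), "\n"; // accepted without complaint
```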

The Solution

There is a simple solution to these problems known as Hungarian Notation, which not only improves the code’s readability by others, but also builds a sort of visual debugging mechanism into the code. For those of you who don’t know, Hungarian notation is a programming naming convention that semantically encodes the type of a variable in its own name. For example the integer dayOfTheWeek would be called iDayOfTheWeek, effectively removing the ambiguity mentioned above.
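To make that concrete, here is a minimal sketch with both forms of the day of the week side by side. The helper iDayFromName() is hypothetical, as is the choice of -1 for an unknown name:

```php
<?php
// Hypothetical helper: the 'i' prefix says it returns an integer,
// and the 's' prefix on the parameter says it expects a string.
function iDayFromName($sDayName) {
    $sNames = array('Sunday', 'Monday', 'Tuesday', 'Wednesday',
                    'Thursday', 'Friday', 'Saturday');
    $iIndex = array_search($sDayName, $sNames, true);
    // array_search() returns false when the name is not found.
    return $iIndex === false ? -1 : $iIndex;
}

$sDayOfTheWeek = 'Wednesday';
$iDayOfTheWeek = iDayFromName($sDayOfTheWeek);

echo $iDayOfTheWeek, "\n"; // prints 3
```

A line such as $iDayOfTheWeek = $sDayOfTheWeek; now looks wrong at a glance, without running anything.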

I program mostly in PHP, so I’m going to offer a PHP solution. Take from it what you will, because each language has slightly different syntax, primitives, variable statuses and visibilities.

Variable Types

PHP supports several primitive types, including:

  • Four scalar types:
    • boolean
    • integer
    • float
    • string
  • Two compound types:
    • array
    • object
  • Two special types:
    • resource
    • null
  • Three pseudo-types:
    • mixed
    • number
    • callback

Here is the Hungarian Notation I recommend for PHP best coding practices:

  • For every prefix, write the prefix itself in lower case, then start the variable name with an upper-case letter or an underscore, such as:
    $bBoolean = true;
    $b_boolean = true;
  • These apply to functions as well:
    function bCheckFlag($sFlagName) {
        return true;
    }
  • boolean
    Begin with a ‘b’ as in ‘boolean’ or an ‘f’ as in ‘flag’.

    $bOption = true;
  • integer
    Begin with an ‘i’.

    $iCounter = 3;
  • float
    Begin with ‘fp’ for ‘floating point’, ‘d’ for ‘double’, or ‘l’ for ‘long’.

    $fpRandom = mt_rand() / mt_getrandmax(); // a float between 0 and 1
  • string
    Begin with an ‘s’ or a ‘c’ as in ‘char’.

    $sDayOfTheWeek = 'Wednesday';
  • arrays
    Pluralize arrays and/or prefix them with ‘a’ or ‘arr’.

    $bOptions = array(true, false);
    $arrOptions = array('true', 'false');
    $MyClasses = array(new MyClass(), new MyClass());

    Note: pluralizing is better, since then the variable type is obvious. However, I have something else in mind with arr that I will show you in a later post.

  • objects
    Capitalize the first letter, and try to use the class name in the variable, or prefix with ‘obj’.

    $MyClass = new MyClass();
    $MyClass_temp = new MyClass();

    Note: The reason I suggest prefixing with ‘obj’ is that, when creating a function, it may be a little excessive to write a function called MemberGetMember(). objGetMember() may be better, while still making it clear that there is an object returned.

  • resources
    Begin with an ‘r’.

    $rDatabase = mysql_connect(...);
  • null
    Prefix with ‘v’ for ‘void’. Save ‘n’ for numbers.

    public function vSetName($sValue) {
        $this->name = $sValue;
    }
  • mixed
    Often in a loosely typed language like PHP, the variable type doesn’t matter. In that case, either prefix with ‘m’ or don’t prefix at all.

    $variable = 32.4;
    $mVariable = true;
  • number
    In a loosely typed language like PHP, numbers can be represented by integers, floats, or even strings. If some sort of number is required, prefix with ‘n’.

    $nAge = '32';
  • callback and lambda functions
    I suggest prefixing with ‘cb’ or ‘lf’, or preferably, prefixing with the return type, such as ‘b’ for boolean.

    // Anonymous functions (PHP 5.3+) replace create_function(), which was removed in PHP 8.
    $lfFunction = function ($arrParams) { return 32; };
    $bFunction = function ($arrParams) { return true; };
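Putting a few of these prefixes together, here is a short sketch of how they read in practice. The variable names and the age threshold are invented for illustration; the point is that the ‘b’ on the callback tells you what it returns, and the ‘n’ tells you a numeric string is acceptable:

```php
<?php
// 'b' prefix: this callback returns a boolean.
$bIsAdult = function ($nAge) {
    // 'n' prefix: any numeric value is accepted, including numeric strings.
    return is_numeric($nAge) && $nAge >= 18;
};

// Plural name with an 'n' prefix: an array of numbers.
$nAges = array(12, '32', 45.5, 17);

// array_filter() keeps the original keys, so re-index with array_values().
$nAdultAges = array_values(array_filter($nAges, $bIsAdult));
$iAdultCount = count($nAdultAges);

echo $iAdultCount, "\n"; // prints 2
```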

Visibility

PEAR, a set of PHP libraries, already uses a convention for displaying a variable’s visibility. Private variables start with a ‘_’ and public variables start with a letter. I propose extending this to ‘_’ for protected and ‘__’ for private. I realize that this convention conflicts with the magic methods in PHP, but I don’t foresee a problem, since it probably isn’t a good idea to create a private method ‘__construct’, for example.
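A minimal sketch of that proposal, combined with the type prefixes, might look like this. The Member class and its members are hypothetical, and password_hash() assumes PHP 5.5 or later:

```php
<?php
class Member {
    public $sName;               // public: starts with a letter
    protected $_iLoginCount = 0; // protected: one underscore
    private $__sPasswordHash;    // private: two underscores

    public function __construct($sName, $sPassword) {
        $this->sName = $sName;
        $this->__sPasswordHash = password_hash($sPassword, PASSWORD_DEFAULT);
    }

    // 'b' prefix: returns true when the password matches.
    public function bLogin($sPassword) {
        if (!password_verify($sPassword, $this->__sPasswordHash)) {
            return false;
        }
        $this->_iLoginCount++;
        return true;
    }

    // 'i' prefix: returns an integer.
    public function iGetLoginCount() {
        return $this->_iLoginCount;
    }
}

$Member = new Member('Ada', 'secret');
var_dump($Member->bLogin('secret'));  // bool(true)
echo $Member->iGetLoginCount(), "\n"; // prints 1
```

Inside the class, every use of $this->__sPasswordHash is a visible reminder that a child class cannot reach it.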

Other Uses

Hungarian notation isn’t a syntax; it’s an extensible coding methodology. Other uses can be concocted for this style of programming. For instance, if someone wanted to insert data into a database, but wanted to visually check if the data had been prepared for insertion, one could do this:

$sData = "blah'; delete from table where 1; -- ";
$csData = mysql_real_escape_string($sData);

Now you can visually check that every string prefixed with ‘cs’ has been cleaned before it enters the database. Any string prefixed with ‘s’ is a potential security risk.
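The same idea carries over to other kinds of cleaning. As a sketch that runs without a database connection, here “clean” means escaped for HTML output rather than for SQL. The helper csEscapeForHtml() is hypothetical; htmlspecialchars() is the standard PHP function doing the work:

```php
<?php
// Hypothetical helper: the 'cs' prefix on the function name says that it
// returns a cleaned string. "Clean" here means escaped for HTML output.
function csEscapeForHtml($sData) {
    return htmlspecialchars($sData, ENT_QUOTES, 'UTF-8');
}

$sComment  = '<script>alert("hi")</script>';
$csComment = csEscapeForHtml($sComment);

echo $csComment, "\n";
// prints &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

If a template ever echoes $sComment instead of $csComment, the prefix makes the mistake easy to see in a code review.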

Conclusion

The important thing isn’t that you use this style of Hungarian notation in your code. What’s important is that you have a standard, predictable convention across your libraries.

At first you may think that it looks cluttered, but after some practice you’ll discover that it not only looks nice, but also helps you visually debug your code without you even having to think about it. It will likely help you reduce bugs and develop more reusable, cleaner code.
