Hungarian Notation
Thursday, February 25th, 2010
I have a friend who, when he first started programming, would name his variables and functions after superheroes and cartoon characters. A function that checked the status of a flag might look like this:
goFindSuperMan(catwoman);
This made debugging his code a nightmare. Without knowing it, he had completely obfuscated the code, while still making it interesting to read.
Despite a few exceptions like this, most programmers these days use reasonably intelligent naming conventions for their functions and variables. For instance, a function used to check the status of a flag will normally look something like this:
checkFlag(flagName);
However, even with good naming conventions, one of the problems with reading other people’s code is that often there is no way to know for sure what type of data a variable stores or if it has a special status.
The problems
Variable and Function Types
This becomes a problem when you accidentally pass a string where an integer is expected. For example, a function that takes the day of the week as an integer, such as 4, could easily be called with the day of the week as a string, such as “Wednesday”.
Variable Visibility
It is also often useful to know whether a variable has a special status, such as being private or static, since that can affect how you write implementations, child classes, or even the business logic of the application.
Object Oriented Languages
Many people argue that object-oriented programming languages such as Java, C#, or PHP don’t need to follow these rules, for two reasons:
- The variable name makes it clear what the data type is. For instance, if an application has a class MyClass, then a variable holding an instance would be called MyClass, and an array of instances would be called MyClasses, as in
$MyClasses = array(new MyClass(), new MyClass());
- If the data type is not correctly set, the application can throw an Exception.
Although both points are true, there are still three problems:
- There is no way to know if the variable has a special status, such as public, private, or static.
- Most object-oriented programming languages still support primitive variable types such as int and float.
- Many programmers use poor or mixed conventions when developing code, particularly in languages that are often picked up by non-programmers, such as PHP, ECMAScript, and ActionScript.
The Solution
There is a simple solution to these problems known as Hungarian notation, which not only makes the code more readable to others, but also builds a sort of visual debugging mechanism into the code. For those of you who don’t know, Hungarian notation is a naming convention that encodes the type of a variable in its name. For example, the integer dayOfTheWeek would be called iDayOfTheWeek, effectively removing the ambiguity mentioned above.
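As a minimal sketch of how the prefix removes the day-of-the-week ambiguity, consider the following. The function sGetDayName() and the numbering scheme (1 for Sunday through 7 for Saturday) are assumptions made up for this illustration, not part of any library.

```php
<?php
// Hypothetical helper, not from any library. The 's' prefix on the function
// name signals a string return value; the 'i' prefix on the parameter signals
// that an integer is expected.
// Assumption: days are numbered 1 (Sunday) through 7 (Saturday).
function sGetDayName($iDayOfTheWeek)
{
    // Pluralized name with an 's' prefix: an array of strings.
    $sDayNames = array(
        1 => 'Sunday',
        2 => 'Monday',
        3 => 'Tuesday',
        4 => 'Wednesday',
        5 => 'Thursday',
        6 => 'Friday',
        7 => 'Saturday',
    );

    return $sDayNames[$iDayOfTheWeek];
}

$iDayOfTheWeek = 4;                            // clearly an integer
$sDayOfTheWeek = sGetDayName($iDayOfTheWeek);  // clearly a string

echo $sDayOfTheWeek, "\n";                     // prints Wednesday
```

With the prefixes in place, a call such as sGetDayName($sDayOfTheWeek) looks wrong at a glance, because an ‘s’ variable is being handed to an ‘i’ parameter.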
I program mostly in PHP, so I’m going to offer a PHP solution. Take from it what you will, because each language has slightly different syntax, primitives, variable statuses and visibilities.
Variable Types
PHP supports many variable primitives, including:
- Four scalars:
  - boolean
  - integer
  - float
  - string
- Two compound:
  - array
  - object
- Two special:
  - resource
  - null
- And three pseudo-types:
  - mixed
  - number
  - callback
Here is the Hungarian Notation I recommend for PHP best coding practices:
- For all of the prefixes, write the prefix in lower case, and start the variable name itself with an uppercase letter or an underscore, such as:
$bBoolean = true;
$b_boolean = true;
- These apply to functions as well:
function bCheckFlag($sFlagName) { return true; }
- boolean
Begin with a ‘b’ as in ‘boolean’ or an ‘f’ as in ‘flag’.
$bOption = true;
- integer
Begin with an ‘i’.
$iCounter = 3;
- float
Begin with ‘fp’ for ‘floating point’, ‘d’ for ‘double’, or ‘l’ for ‘long’.
$fpRandom = mt_rand() / (float) mt_getrandmax();
- string
Begin with an ‘s’ or a ‘c’ as in ‘char’.
$sDayOfTheWeek = 'Wednesday';
- arrays
Pluralize arrays and/or prefix them with ‘a’ or ‘arr’.
$bOptions = array(true, false);
$arrOptions = array('true', 'false');
$MyClasses = array(new MyClass(), new MyClass());
Note: pluralizing is better, since then the variable type is obvious. However, I have something else in mind with arr that I will show you in a later post.
- objects
Capitalize the first letter, and try to use the class name in the variable, or prefix with ‘obj’.
$MyClass = new MyClass();
$MyClass_temp = new MyClass();
Note: The reason I suggest prefixing with ‘obj’ is that, when creating a function, it may be a little excessive to write a function called MemberGetMember(). objGetMember() may be better, while still making it clear that there is an object returned.
- resources
Begin with an ‘r’.
$rDatabase = mysql_connect(...);
- null
Prefix with ‘v’ for ‘void’. Save ‘n’ for ‘numbers’.
public function vSetName($sValue) { $this->name = $sValue; }
- mixed
Often in a loosely typed language like PHP, the variable type doesn’t matter. In that case, either prefix with ‘m’ or don’t prefix at all.
$variable = 32.4;
$mVariable = true;
- number
In a loosely typed language like PHP, numbers can be represented by integers, floats, or even strings. If some sort of number is required, prefix with ‘n’.
$nAge = '32';
- callback and lambda functions
I suggest prefixing with ‘cb’ or ‘lf’ or, preferably, with the return type, such as ‘b’ for boolean.
$lfFunction = function ($arrParams) { return 32; };
$bFunction = function ($arrParams) { return true; };
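To show how these prefixes read together, here is a short sketch. The function bIsAdult(), the age threshold of 18, and the sample names are assumptions invented for the illustration.

```php
<?php
// Hypothetical function: the 'b' prefix says it returns a boolean, and the
// 'n' prefix on the parameter says any kind of number is accepted.
function bIsAdult($nAge)
{
    // $nAge may be an integer, a float, or a numeric string.
    if (!is_numeric($nAge)) {
        return false;
    }

    // Once converted, the 'i' prefix documents that this is a real integer.
    $iAge = (int) $nAge;

    // Assumption for this sketch: adulthood starts at 18.
    return $iAge >= 18;
}

$nAge = '32';                     // a number stored as a string
$bAdult = bIsAdult($nAge);        // boolean result

$sNames = array('Ann', 'Bob');    // pluralized 's': an array of strings
$iNameCount = count($sNames);     // integer

var_dump($bAdult, $iNameCount);
```

Reading the code, each assignment can be checked by eye: a ‘b’ variable receives the result of a ‘b’ function, and an ‘i’ variable receives a count.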
Visibility
PEAR, a set of PHP libraries, already uses a convention for displaying a variable’s visibility. Private variables start with a ‘_’ and public variables start with a letter. I propose extending this to ‘_’ for protected and ‘__’ for private. I realize that this convention conflicts with the magic methods in PHP, but I don’t foresee a problem, since it probably isn’t a good idea to create a private method named ‘__construct’, for example.
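A rough sketch of the proposed visibility prefixes, combined with the type prefixes, might look like this. The class Member and all of its members are hypothetical names chosen for the example.

```php
<?php
// Hypothetical class illustrating the proposed visibility prefixes.
class Member
{
    public $sName = '';             // public: starts with a letter
    protected $_iLoginCount = 0;    // protected: one underscore
    private $__bLocked = false;     // private: two underscores

    // Public method: returns the login count after attempting a login.
    public function iLogin()
    {
        if ($this->__bIsLocked()) {
            return $this->_iLoginCount;
        }

        return $this->_iIncrementLoginCount();
    }

    // Protected method: one underscore, then the return-type prefix.
    protected function _iIncrementLoginCount()
    {
        $this->_iLoginCount++;

        return $this->_iLoginCount;
    }

    // Private method: two underscores, then the return-type prefix.
    private function __bIsLocked()
    {
        return $this->__bLocked;
    }
}

$Member = new Member();
$Member->sName = 'Alice';
$iLogins = $Member->iLogin();
```

Any call such as $Member->__bIsLocked() made from outside the class now stands out as suspicious before PHP ever reports the access error.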
Other Uses
Hungarian notation isn’t a syntax; it’s an extensible coding methodology. Other uses can be concocted for this style of programming. For instance, if someone wanted to insert data into a database, but wanted to visually check if the data had been prepared for insertion, one could do this:
$sData = "blah'; delete from table where 1; -- ";
$csData = mysql_real_escape_string($sData);
Now you can visually check that all strings prefixed with ‘cs’ will have been cleaned before entering the database. Any string prefixed with ‘s’ is a potential security risk.
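The same idea can be carried into function names and parameters. The sketch below is only an illustration of the naming: csCleanUsername(), sBuildLookupQuery(), and the members table are hypothetical, and the whitelist rule is a placeholder assumption standing in for whatever escaping your database driver provides.

```php
<?php
// Hypothetical cleaner: the 'cs' return prefix marks the result as a cleaned
// string. Assumption: usernames may only contain letters, digits, and
// underscores, so everything else is stripped.
function csCleanUsername($sUsername)
{
    return preg_replace('/[^A-Za-z0-9_]/', '', $sUsername);
}

// Hypothetical consumer: the 'cs' parameter name documents that only cleaned
// strings should ever be passed in.
function sBuildLookupQuery($csUsername)
{
    return "SELECT * FROM members WHERE username = '" . $csUsername . "'";
}

$sUsername = "blah'; delete from members where 1; -- ";  // raw, untrusted
$csUsername = csCleanUsername($sUsername);               // cleaned
$sQuery = sBuildLookupQuery($csUsername);

echo $sQuery, "\n";
```

A call like sBuildLookupQuery($sUsername) is then visibly wrong, since an ‘s’ variable is being passed where a ‘cs’ parameter is expected.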
Conclusion
The important thing isn’t that you use this style of Hungarian notation in your code. What’s important is that you have a standard, predictable convention across your libraries.
At first you may think that it looks cluttered, but after some practice you’ll discover that it not only looks nice, but also helps you visually debug your code without you even having to think about it. It will likely help you reduce bugs and develop more reusable, cleaner code.