PHP zvals and symbol table(s)

I've had a broad understanding of how PHP maintains and stores variables for a while, but recently I did some more reading about it so that I could better understand a solution on Stack Overflow.

zval Containers

When you assign a value to a variable, PHP stores that value in what is known as a zval container. At this stage I find it easiest to think of a zval like a table row, or an array item. Each one contains:

  • type, the type of the value
  • value, the value itself
  • is_ref, a boolean that lets PHP know whether the value belongs to a reference
  • refcount, which counts how many variable names (symbols) point to this value. A refcount above 1 doesn't always mean a variable was assigned by reference. For instance:
// refcount is 1 after you set this
$a = 'Levi Jackson';
// refcount is 2 after this point
$b = $a;

// $b now points to its own value: refcount for 'Levi Jackson' is 1 and for 'Jackson Levi' is 1
$b = 'Jackson Levi';
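
One rough way to watch these counts change is debug_zval_dump(), which prints a refcount next to the value it is given. The sketch below is only an illustration under a few assumptions: zval_refcount() is a hypothetical helper, not a PHP function; the absolute numbers include the temporary references created by the function calls themselves; and the array is built with time() because newer PHP versions treat constant strings and arrays specially. Only the differences between the readings are meaningful.

```php
<?php
// Hypothetical helper: extracts the first refcount that debug_zval_dump() prints.
// The number includes references added by passing the value into these functions.
function zval_refcount($value): int
{
    ob_start();
    debug_zval_dump($value);
    $dump = ob_get_clean();

    // The first "refcount(N)" in the dump belongs to the outermost value.
    return preg_match('/refcount\((\d+)\)/', $dump, $m) ? (int) $m[1] : -1;
}

// time() forces PHP to build the array at runtime (an assumption made so the
// value is an ordinary refcounted array rather than a shared constant).
$a = array('name' => 'Levi Jackson', 'created' => time());
$before = zval_refcount($a);

// A plain assignment shares the same value: one more symbol points to it.
$b = $a;
$shared = zval_refcount($a);

// Writing to $b gives it its own copy, so $a is back to a single symbol.
$b['name'] = 'Jackson Levi';
$after = zval_refcount($a);

var_dump($shared - $before); // int(1)
var_dump($after - $before);  // int(0)
var_dump($a['name']);        // string(12) "Levi Jackson"
```

The copy is only made at the moment $b is written to, which is why a plain assignment is cheap even for large arrays.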

It is important to note that a zval container does not include the variable's name. Names are stored in what is called a symbol table.

Symbol Table

There is a symbol table for each level of scope (global, functions, methods, etc.), and compound values keep a table of their own as well. At the most basic level a symbol table maps a variable name to a zval. Scalar values like a string are pretty straightforward: the name maps directly to the zval holding the value. Compound types such as arrays and objects store their elements or properties in a separate symbol table. So for an object like the following:

class Person {
    public $name;
    public $email;

    public function __construct($name = '', $email = '') {
        $this->name = $name;
        $this->email = $email;
    }
}

$levi = new Person('Levi', 'test@test.com');

This would add an entry for $levi in the global symbol table, and the object would get one symbol table of its own with two entries, one for each of the properties name and email.
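
The internal tables are not directly visible from PHP code, but get_defined_vars() and get_object_vars() give a reasonable userland approximation of them. The sketch below is an illustration only; localNames() and $insideOnly are hypothetical names added to show that a function call gets its own table.

```php
<?php
class Person
{
    public $name;
    public $email;

    public function __construct($name = '', $email = '')
    {
        $this->name = $name;
        $this->email = $email;
    }
}

// Hypothetical helper: returns the names in this function's own symbol table.
function localNames()
{
    $insideOnly = 'local';
    return array_keys(get_defined_vars());
}

$levi = new Person('Levi', 'test@test.com');

// Names known in the current scope, including 'levi'.
$scopeNames = array_keys(get_defined_vars());

// Names in the object's own property table.
$propertyNames = array_keys(get_object_vars($levi));

// Names visible inside the function: nothing from the calling scope leaks in.
$functionNames = localNames();

var_dump(in_array('levi', $scopeNames, true)); // bool(true)
echo implode(', ', $propertyNames), "\n";      // name, email
echo implode(', ', $functionNames), "\n";      // insideOnly
```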

Garbage Collection

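Once a zval's refcount drops to 0, PHP frees it from memory. It is easy to assume that this always cleans everything up, but an example on the php.net site shows a case where it does not. First, the simple case. Given the following

$tmpArray = array( 'name' => 'Levi');

you have an entry for $tmpArray in the current symbol table, and the array has its own table with an entry for 'name'. Both zvals have a refcount of 1. If you then unset($tmpArray), the symbol is removed and the array's refcount drops to 0. When PHP destroys the array it also decrements the refcount of each element, so 'Levi' is freed along with it; there is no need to unset($tmpArray['name']) first. The case php.net warns about is an array that contains a reference to itself, for example $a = array('one'); $a[] =& $a;. After unset($a) no symbol points to the array anymore, but its refcount is still 1 because one of its own elements points back to it. Before PHP 5.3 that memory was only released at the end of the request; since 5.3 the cycle collector can reclaim it.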
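
To check what unset() actually releases, here is a small sketch covering two cases: an ordinary array, and an array that holds a reference to itself. It is an illustration only. Tracker is a hypothetical class whose destructor counts destroyed instances, and the exact number returned by gc_collect_cycles() is left unstated since it is an internal detail.

```php
<?php
// Hypothetical class that counts how many of its instances have been destroyed.
class Tracker
{
    public static $destroyed = 0;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

// Case 1: unsetting an ordinary array also releases its elements.
$tmpArray = array('name' => new Tracker());
unset($tmpArray);
$freedWithArray = Tracker::$destroyed;

// Case 2: an array that references itself is not freed by unset() alone,
// because its own element keeps the refcount above 0.
gc_enable();
$a = array('one');
$a[] =& $a;
unset($a);

// The cycle collector (PHP 5.3+) finds and frees the orphaned structure.
$collected = gc_collect_cycles();

var_dump($freedWithArray); // int(1)
var_dump($collected > 0);  // bool(true)
```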

Pretty interesting stuff, and I hope to do some more reading on the internals of PHP soon. One of the more interesting things I've come across recently related to the above is that a foreach loop has the potential to copy the array it is looping over (and rightly so) if the refcount for the array variable is more than 1.
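
The copy itself is internal, but its effect is easy to observe: a by-value foreach keeps iterating over the array as it was when the loop started, even if the loop body modifies the variable. A minimal sketch, with $items and $seen as made-up names:

```php
<?php
$items = array(1, 2, 3);
$seen  = array();

foreach ($items as $item) {
    $seen[] = $item;

    // Appending inside the loop changes $items, but not what the loop iterates.
    if ($item === 1) {
        $items[] = 4;
    }
}

echo implode(', ', $seen), "\n";  // 1, 2, 3
echo implode(', ', $items), "\n"; // 1, 2, 3, 4
```

Iterating by reference (foreach ($items as &$item)) behaves differently, so this only describes the by-value form.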

http://php.net/manual/en/features.gc.refcounting-basics.php

If you have any feedback for me, I'd love to hear it - corrections, alternative paths, you name it! Send me an email levi@levijackson.xyz