Visualizing Trees and ASTs with Graphviz
Visualization is a great opportunity to see some complex structures from another side. Let’s assume we want to inspect how trees actually look. We will use Breadth-first search (BFS) to traverse graph (or tree in our case) and show connections between nodes.
def bfs(node)
queue = [node]
discovered = []
while queue.length > 0 do
current = queue.shift
discovered << current
current.children.each do |child|
queue << child if child
end
end
discovered
end
The awesome fact about BFS is that it traverse nodes from left to right, so, we will traverse tree level-by-level. Let’s assume we already have tree class. We start from filling a tree:
tree = Tree.new
[8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15].each do |value|
tree.insert(value)
end
And now we will traverse all tree nodes and render it with left and right child:
([["Node", "Left", "Right"]] + bfs(tree.root).map do |node|
[node.value, node.left&.value || "NULL", node.right&.value || "NULL"]
end).each { |n| puts n.join(" ") }
This code produces the following table (additionally I piped it through column -t):
Node Left Right
8 4 12
4 2 6
12 10 14
2 1 3
6 5 7
10 9 11
14 13 15
1 NULL NULL
3 NULL NULL
5 NULL NULL
7 NULL NULL
9 NULL NULL
11 NULL NULL
13 NULL NULL
15 NULL NULL
It’s time to draw a tree! We will use graphviz for rendering.
According to Graphviz documentation we need to simply describe edges in proper format and graphviz will render the graph. Sounds easy. We will generate .dot file and then render it. To reduce complexity of manipulating with files we will write output to stdout and then pipe it into the graphviz tool. It will look like:
ruby tree.rb | dot | neato -n -Tpng -o tree.png
Additionally, to decrease feedback loop (in case if you are using iTerm), you could use build-in command imgcat, so it will look like:
ruby tree.rb | dot | neato -n -Tpng | imgcat
This code will generate our first node graph:
def render_tree(tree)
result = []
result << "digraph graphname {" # graph type
result << "node [shape=circle];" # we want pretty circles as nodes
bfs(tree).each do |node|
node.children.each do |child|
next if child.nil?
result << [node.value, child.value].join(" -> ")
end
end
result << "}"
result.join("\n")
end

It will generate Graphviz file with the following content:
digraph graphname {
node [shape=circle];
8 -> 4
8 -> 12
4 -> 2
4 -> 6
12 -> 10
12 -> 14
2 -> 1
2 -> 3
6 -> 5
6 -> 7
10 -> 9
10 -> 11
14 -> 13
14 -> 15
}
But there are some problems. Let’s assume we have more than one node with the same value ([5, 3, 7, 3, 7, 5]):

Hmm…, its not what we expect. We need more unique identificator and Ruby provides object_id method, which we can use to get unique identifier for each node.

Ok, it’s better, but still not so good, we solved one problem but there is another one. Graphviz allows you to define how to render nodes. In order to do this, you need to append a line with label’s contents.
New code:
def render_tree(tree)
result = []
result << "digraph graphname {"
result << "node [shape=circle];"
bfs(tree).each do |node|
result << "#{node.object_id} [label=#{node.value}]"
node.children.each do |child|
next if child.nil?
result << [node.object_id, child.object_id].join(" -> ")
end
end
result << "}"
result.join("\n")
end
This is file generated by our method:
digraph graphname {
node [shape=circle];
70120168878200 [label=5]
70120168878200 -> 70120168878140
70120168878200 -> 70120168878120
70120168878140 [label=3]
70120168878140 -> 70120168878100
70120168878120 [label=7]
70120168878120 -> 70120168878000
70120168878120 -> 70120168878020
70120168878100 [label=3]
70120168878000 [label=5]
70120168878020 [label=7]
}
Result is:

Ok, nice, but it’s not actually a binary tree. There is a nice answer on StackOverlow which gives you idea how to do render this tree as binary tree: https://stackoverflow.com/a/10905657.

Yes! So nice, actually what we want. But…

Hey! Node with value 6 is on the right of node with value 7 — is something wrong with the tree insertion code?
Node Left Right
5 3 7
3 NULL 4
7 6 NULL
4 NULL NULL
6 NULL NULL
Seems like not. We need to try something else and there is question on Graphviz site which should help us: http://www.graphviz.org/content/FaqBalanceTree. According to that we need to add invisible nodes to our tree. Additionally we need to add invisible nodes for nodes with just one leaf. To implement it we introduce new class InvisibleNode which we inherit from Node class.
class TreeRenderer
InvisibleNode = Class.new(Node)
def initialize(tree)
@tree = tree
end
def call
result = []
result << "digraph graphname {"
result << "node [shape=circle];"
bfs(tree).each do |node|
result << ["#{node.object_id}", %{[label=#{node.value} fixedsize=true]}].join(" ")
fix_missing_children(node).each do |child|
case child
when InvisibleNode
result << [child.object_id, %{[label="" width=.1 style=invis]}].join(" ")
result << [[node.object_id, child.object_id].join(" -> "), '[style=invis]'].join(" ")
when Node
result << [[node.object_id, child.object_id].join(" -> ")].join
end
end
end
result << "}"
result.join("\n")
end
private
attr_reader :tree
def fix_missing_children(node)
node
.children
.map { |child| child || generate_fake_child(node) }
.insert(1, generate_fake_child(node))
end
def generate_fake_child(node)
InvisibleNode.new.tap { |child| child.parent = node }
end
end
It’s not so cool now, because, actually, it’s complex. We could delegate nodes rendering to the Node class, but I think it’s not perfect idea to change actual “business-logic” code for rendering.

Additionally it will be nice to have ability to show NULL values, for debug purposes.
class TreeRenderer
InvisibleNode = Class.new(Node)
NilNode = Class.new(Node)
def initialize(tree, with_nil)
@tree = tree
@with_nil = with_nil
end
def call
result = []
result << "digraph graphname {"
result << "node [shape=circle];"
bfs(tree).each do |node|
result << ["#{node.object_id}", %{[label=#{node.value} fixedsize=true]}].join(" ")
fix_missing_children(node).each do |child|
case child
when InvisibleNode
result << [child.object_id, %{[label="" width=.1 style=invis]}].join(" ")
result << [[node.object_id, child.object_id].join(" -> "), '[style=invis]'].join(" ")
when NilNode
result << [child.object_id, %{[label="NULL" shape="box" style="#{with_nil ? '' : 'invis'}"]}].join(" ")
result << [[node.object_id, child.object_id].join(" -> "), %{[style="#{with_nil ? '' : 'invis'}"]}].join(" ")
when Node
result << [[node.object_id, child.object_id].join(" -> ")].join
end
end
end
result << "}"
result.join("\n")
end
private
attr_reader :tree, :with_nil
def fix_missing_children(node)
node
.children
.map { |child| child || generate_nil_child(node) }
.insert(1, generate_fake_child(node))
end
def generate_nil_child(node)
NilNode.new.tap { |child| child.parent = node }
end
def generate_fake_child(node)
InvisibleNode.new.tap { |child| child.parent = node }
end
end
And the result is:

How can you use it? With minimal changes you can render any binary tree. For example here is RB-tree rendered this way.

I used this code to debug my RB-tree implementation. There are dotted edges which represents child-parent connection. So, I think you can easily find what’s wrong here.

The same renderer works on any tree structure. Let’s apply it to something more practical.
Parsing Ruby ASTs
An AST is just a tree, so the renderer above works with minor adjustments. The Ruby parser gem exposes the AST for any source file. The canonical approach would be the Visitor pattern, but we can keep it simpler:
require 'parser/current'
class ASTRenderer
def initialize(tree)
@tree = tree
end
def call
result = []
result << "digraph graphname {"
bfs(@tree).each do |node|
result << ["#{node.object_id}", %Q{[label="#{present(node)}"]}].join(" ")
node.children.each do |child|
next unless child.is_a?(Parser::AST::Node)
result << [[node.object_id, child.object_id].join(" -> ")].join
end
end
result << "}"
result.join("\n")
end
private
def present(node)
return node.to_s unless node.is_a?(Parser::AST::Node)
case node.type
when :begin then "(begin)"
when :if then "(if)"
when :array then "(array)"
when :send then "(send :#{node.to_a[1]})"
when :lvar then "(lvar :#{node.to_a[0]})"
when :def then "(def :#{node.to_a[0]})"
when :str then "(str :#{node.to_a[0]})"
when :args then "(args)"
when :return then "(return)"
else node.to_s
end
end
def bfs(node)
queue = [node]
discovered = []
while queue.length > 0 do
current = queue.shift
next unless current.is_a?(Parser::AST::Node)
discovered << current
current.children.each do |child|
queue << child if child
end
end
discovered
end
end
source_code = <<RUBY
def factorial(n)
return 1 if [1, 2].includes?(n)
factorial(n - 1) + factorial(n - 2)
end
RUBY
ast = Parser::CurrentRuby.parse(source_code)
puts ASTRenderer.new(ast).call
Result:

Result with comments:

The changes from the binary tree renderer are minimal: no binary tree layout code, non-Parser::AST::Node children are skipped, and node labels come from the AST node type. Even on this small example the full AST is hard to read — there’s too much detail. But we can filter it down.
Let’s look at what a class definition node looks like:
class Foo < Bar
def body; end
end
s(:class,
s(:const, nil, :Foo),
s(:const, nil, :Bar),
s(:def, :body,
s(:args), nil))
s is s-expression notation for nested lists. The class node has three arguments: the class name constant, the parent class constant, and the body.
Building a Class Dependency Diagram
Let’s catch class definitions and store them:
class KlassDefinitionsProcessor < Parser::AST::Processor
attr_reader :klass_definitions
def initialize(*)
super
@klass_definitions = []
end
def on_class(node)
klass_konst, _parrent, _body = *node
_nested_konst, konst_name = *klass_konst
@klass_definitions << konst_name
super
end
end
source_code = <<RUBY
class Foo; end
class Bar; end
RUBY
ast = Parser::CurrentRuby.parse(source_code)
processor = KlassDefinitionsProcessor.new
processor.process(ast)
p processor.klass_definitions
# => [:Foo, :Bar]
Good start. But namespaced constants like Foo::Bar::Baz get truncated. Fix: use klass_konst.loc.expression.source instead of destructuring:
def on_class(node)
klass_konst, _parrent, _body = *node
@klass_definitions << klass_konst.loc.expression.source
super
end
class Foo::Bar::Baz; end
class Oof::Zab; end
# => ["Foo::Bar::Baz", "Oof::Zab"]
Now for the dependency edges. We need to track which constants are referenced inside each class. The problem: standard AST processing is top-down, but we need to know the enclosing class for any given constant. The solution is a nesting stack — push on entry, pop on exit:
def process(node)
@nesting.push(node)
super
@nesting.pop
end
Now we can easily access the current node’s parent:
def parent
@nesting.last(2).first
end
and find in which class we are processing current node:
def context_klass
@nesting.reverse.select { |x| x.type == :class }.first
end
Which gives us useful helpers:
def klass_name_for(node)
klass_konst, _, _ = *node
source_for(klass_konst)
end
def source_for(node)
node.loc.expression.source
end
Wiring it together — collect class definitions and constant references:
def on_class(node)
@klass_definitions << klass_name_for(node)
super
end
def on_const(node)
konst_name = source_for(node)
current_klass = klass_name_for(context_klass)
(@klass_dependencies[current_klass] ||= []) << konst_name
super
end
class Foo::Bar::Baz; end
class Oof::Zab; def call; Foo::Bar::Baz.new; end; end
# processor.klass_dependencies
# => {
# "Foo::Bar::Baz"=>["Foo::Bar::Baz", "Foo::Bar", "Foo"],
# "Oof::Zab"=>["Oof::Zab", "Oof", "Foo::Bar::Baz", "Foo::Bar", "Foo"]
# }
Too noisy — nested constants trigger on_const for each segment, and the class name itself is included. Skip those cases:
def on_const(node)
unless %i[class const].include?(parent.type)
konst_name = source_for(node)
if context_klass
current_klass = klass_name_for(context_klass)
(@klass_dependencies[current_klass] ||= []) << konst_name
end
end
super
end
class Foo::Bar::Baz; end
class Oof::Zab; def call; Foo::Bar::Baz.new; end; end
# {"Oof::Zab"=>["Foo::Bar::Baz"]}
Now render it with Graphviz:
class GraphRenderer
def initialize(relations, klasses)
@relations = relations
@klasses = klasses
end
def call
[
begin_header,
klasses_info,
relations_info,
end_header
].flatten.join("\n")
end
private
def begin_header
[
%(digraph graphname {),
%(rankdir="LR")
]
end
def klasses_info
@klasses.map do |klass|
[wrap(klass), %Q{[label="#{klass}"]}].join(" ")
end
end
def relations_info
@relations.map do |klass, dependents|
dependents.map do |child|
[wrap(klass), wrap(child)].join(' -> ')
end
end
end
def end_header
[%(})]
end
def wrap(text)
'"' + text + '"'
end
end

One remaining limitation: Ruby’s namespace resolution means B::C inside module A and A::B::C from outside resolve to the same class, but our processor treats them as different constants. Fully resolving this requires implementing Ruby’s constant lookup rules — left as a future exercise.