Offline call graph generator for Python 3
We use semantic versioning.
Pyan takes one or more Python source files, performs a (rather superficial) static analysis, and constructs a directed graph of the objects in the combined source, and how they define or use each other. The graph can be output for rendering by GraphViz or yEd, or as a plain-text dependency list.
This project has 2 official repositories:
- The original stable davidfraser/pyan.
- The development repository Technologicat/pyan.
The PyPI package pyan3 is built from the development repository.
Pyan3 is back in active development. The analyzer has been modernized and tested on Python 3.10–3.14, with fixes for all modern syntax (walrus operator, match statements, async with, type aliases, inlined comprehension scopes in 3.12+, and more).
What's new in the revival:
- Full support for Python 3.10–3.14 syntax
- Module-level import dependency analysis (--module-level flag and create_modulegraph() API), with import cycle detection
- Comprehensive test suite (80+ tests)
- Modernized build system and dependencies
This revival was carried out by Technologicat with Claude (Anthropic) as AI pair programmer. See AUTHORS.md for the full contributor history.
Defines relations are drawn with dotted gray arrows.
Uses relations are drawn with black solid arrows. Recursion is indicated by an arrow from a node to itself. Mutual recursion between nodes X and Y is indicated by a pair of arrows, one pointing from X to Y, and the other from Y to X.
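As a rough illustration of these drawing conventions, the sketch below emits DOT text for a few edges. The function name edges_to_dot and the exact attribute values are assumptions made for this example; Pyan's own DOT writer may use different attributes and shades.

```python
def edges_to_dot(defines, uses):
    """Render (source, target) pairs as DOT; hypothetical helper, not Pyan's API."""
    lines = ["digraph G {"]
    for src, dst in defines:
        # Defines relations: dotted gray arrows.
        lines.append(f'    "{src}" -> "{dst}" [style="dotted", color="gray"];')
    for src, dst in uses:
        # Uses relations: solid black arrows. A recursive function simply
        # has the same node as source and target, giving a self-loop.
        lines.append(f'    "{src}" -> "{dst}" [style="solid", color="black"];')
    lines.append("}")
    return "\n".join(lines)


print(edges_to_dot(
    defines=[("mymodule", "mymodule.fact")],
    uses=[("mymodule.fact", "mymodule.fact")],  # recursion
))
```

Mutual recursion between X and Y would appear as two uses pairs, (X, Y) and (Y, X).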
Nodes are always filled, and made translucent to clearly show any arrows passing underneath them. This is especially useful for large graphs with GraphViz's fdp filter. If colored output is not enabled, the fill is white.
In node coloring, the HSL color model is used. The hue is determined by the filename the node comes from. The lightness is determined by depth of namespace nesting, with darker meaning more deeply nested. Saturation is constant. The spacing between different hues depends on the number of files analyzed; better results are obtained for fewer files.
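The coloring scheme can be sketched with the standard library's colorsys module. This is a minimal sketch, not Pyan's actual code: the function name node_fill_color, the lightness endpoints (0.9 and 0.5), and the alpha value are all assumptions chosen for illustration.

```python
import colorsys


def node_fill_color(file_index, n_files, depth, max_depth, alpha=0.7):
    """Return a GraphViz-style "#rrggbbaa" fill color for one node."""
    # Hue: spread the analyzed files evenly around the color wheel, so
    # fewer files means hues that are farther apart.
    hue = file_index / n_files
    # Lightness: 0.9 at the top level, darkening linearly to 0.5 at the
    # deepest nesting level (both endpoints are arbitrary choices here).
    fraction = depth / max_depth if max_depth else 0.0
    lightness = 0.9 - 0.4 * fraction
    saturation = 1.0  # constant, as described above
    # Note the argument order: colorsys uses HLS, not HSL.
    r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
    channels = [round(255 * c) for c in (r, g, b, alpha)]
    return "#" + "".join(f"{c:02x}" for c in channels)


# Two nodes from the same file (index 0 of 3) at different nesting depths.
print(node_fill_color(0, 3, depth=0, max_depth=2))
print(node_fill_color(0, 3, depth=2, max_depth=2))
```

The alpha channel is what makes the fill translucent, so arrows passing under a node remain visible.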
Groups are filled with translucent gray to avoid clashes with any node color.
The nodes can be annotated by filename and source line number information.
The static analysis approach Pyan takes is different from running the code and seeing which functions are called and how often. There are various tools that will generate a call graph that way, usually using a debugger or profiling trace hooks, such as Python Call Graph.
In Pyan3, the analyzer was ported from compiler (good riddance) to a combination of ast and symtable, and slightly extended.
pip install pyan3
Pyan3 requires Python 3.10 or newer.
For SVG and HTML output, you need the dot command from Graphviz installed on your system (e.g. sudo apt-get install graphviz on Debian/Ubuntu, brew install graphviz on macOS). Dot output requires no extra system dependencies.
This repository uses uv for local builds and releases.
# install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh
# set up a development environment (editable install + dev/test extras)
uv sync --extra dev --extra test
# alternatively, use the helper wrapper
scripts/uv-dev.sh setup
# run the CLI locally
uv run pyan3 --help
# build distribution artifacts
uv build
# run the default test suite
uv run pytest tests -q
Helper scripts are provided for common workflows:
- ./makedist.sh – builds wheels and source distributions via uv build.
- ./uploaddist.sh <version> – publishes artifacts, preferring uv publish when available.
- scripts/test-python-versions.sh – smoke-tests the package across the Python interpreters detected on your system.
- scripts/uv-dev.sh – wraps the most common uv commands (setup, test, lint, build, matrix tests). Run with no arguments for an interactive menu.
If you are new to uv, read CONTRIBUTING.md for a concise onboarding guide that covers:
- Installing uv and managing Python versions.
- Creating project environments, installing an editable copy, and running tests/builds/lint.
- Using helper scripts such as scripts/uv-dev.sh and scripts/test-python-versions.sh.
- Links to the ROADMAP and open issues (e.g., #105) if you are looking for contribution ideas.
See pyan3 --help.
Example:
pyan *.py --uses --no-defines --colored --grouped --annotated --dot >myuses.dot
Then render using your favorite GraphViz filter, mainly dot or fdp:
dot -Tsvg myuses.dot >myuses.svg
Or generate the SVG directly:
pyan *.py --uses --no-defines --colored --grouped --annotated --svg >myuses.svg
You can also export an interactive HTML page:
pyan *.py --uses --no-defines --colored --grouped --annotated --html > myuses.html
Or produce a plain-text dependency list:
pyan *.py --uses --no-defines --text
Alternatively, you can call pyan from a script:
import pyan
from IPython.display import HTML
HTML(pyan.create_callgraph(filenames="**/*.py", format="html"))
You can integrate callgraphs into Sphinx.
Install graphviz (e.g. via sudo apt-get install graphviz) and modify source/conf.py as follows:
# modify extensions
extensions = [
    ...
    "sphinx.ext.graphviz",
    "pyan.sphinx",
]
# add graphviz options
graphviz_output_format = "svg"
Now, there is a callgraph directive which has all the options of the graphviz directive and in addition:
- :no-groups: (boolean flag): do not group
- :no-defines: (boolean flag): do not draw edges that show which functions, methods and classes are defined by a class or module
- :no-uses: (boolean flag): do not draw edges that show how a function uses other functions
- :no-colors: (boolean flag): do not color the callgraph (default is colored)
- :nested-groups: (boolean flag): group by modules and submodules
- :annotated: (boolean flag): annotate callgraph with file names
- :direction: (string): "horizontal" or "vertical" callgraph
- :toctree: (string): path to toctree (as used with autosummary) to link elements of callgraph to documentation (makes all nodes clickable)
- :zoomable: (boolean flag): enables users to zoom and pan callgraph
Example: create a callgraph for the function pyan.create_callgraph that is zoomable, is laid out from left to right, and links each node to the API documentation that was created at the toctree path api.
.. callgraph:: pyan.create_callgraph
   :toctree: api
   :zoomable:
   :direction: horizontal
If GraphViz says trouble in init_rank, try adding -Gnewrank=true, as in:
dot -Gnewrank=true -Tsvg myuses.dot >myuses.svg
Usually either old or new rank (but often not both) works; this is a long-standing GraphViz issue with complex graphs.
If the graph is visually unreadable due to too much detail, consider visualizing only a subset of the files in your project. Any references to files outside the analyzed set will be considered as undefined, and will not be drawn.
For a higher-level view, use --module-level mode (see below).
The --module-level flag switches pyan3 from call-graph mode to module-level import dependency analysis. Instead of graphing individual functions and methods, it shows which modules import which other modules.
pyan3 --module-level pkg/**/*.py --dot -c -e >modules.dot
pyan3 --module-level pkg/**/*.py --dot -c -e | dot -Tsvg >modules.svg
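To give a feel for what module-level analysis looks at, here is a minimal sketch that collects the modules imported by a piece of source code using the standard ast module. It is not Pyan's implementation; the function name imported_modules is made up for this example, and relative imports are deliberately skipped to keep it short.

```python
import ast


def imported_modules(source):
    """Return the sorted names of modules imported by absolute imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b as ab" records "a.b".
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # "from pkg.sub import thing" records "pkg.sub".
            # node.level > 0 marks a relative import, ignored in this sketch.
            found.add(node.module)
    return sorted(found)


example = "import os\nfrom pkg.sub import thing\nfrom . import sibling\nimport a.b as ab\n"
print(imported_modules(example))
```

Each (importing module, imported module) pair then becomes one edge in the module dependency graph.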
The module-level mode has its own set of options (separate from the call-graph mode). Use pyan3 --module-level --help for the full list. Key options:
- --dot, --svg, --html, --tgf, --yed, --text: output format (default: dot)
- -c, --colored: color by package
- -g, --grouped: group by namespace
- -e, --nested-groups: nested subgraph clusters (implies -g)
- -C, --cycles: detect and report import cycles to stdout
- --dot-rankdir: layout direction (TB, LR, BT, RL)
- --root: project root directory (file paths are made relative to this before deriving module names; if omitted, cwd is assumed)
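The role of --root can be illustrated with a small sketch of how a file path might be turned into a dotted module name. This is an assumption about the general approach, not Pyan's code; module_name_from_path is a hypothetical helper, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def module_name_from_path(path, root="."):
    """Derive a dotted module name from a file path relative to root."""
    relative = PurePosixPath(path).relative_to(PurePosixPath(root))
    parts = list(relative.with_suffix("").parts)  # drop the ".py" suffix
    if parts and parts[-1] == "__init__":
        # A package's __init__.py stands for the package itself.
        parts.pop()
    return ".".join(parts)


print(module_name_from_path("src/pkg/sub/mod.py", root="src"))
print(module_name_from_path("src/pkg/__init__.py", root="src"))
```

With the wrong root, the same file would get a different module name (for example with a leading src. component), and imports of it would no longer match.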
The -C flag performs exhaustive import cycle detection using depth-first search (DFS) from every module:
pyan3 --module-level pkg/**/*.py -C
This finds all unique import cycles in the analyzed module set, and reports statistics (count, min/average/median/max cycle length). Note that for large codebases, the number of cycles can be large — most are harmless consequences of cross-package imports.
If a cycle is actually causing an ImportError, you usually already know which cycle from the traceback. The -C flag provides a broader view of what other cycles exist.
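The idea of running a DFS from every module can be sketched as follows. This is an illustrative sketch only, not the code behind -C: the function name and the dictionary input format are assumptions, and this naive version can take exponential time on densely connected graphs.

```python
from statistics import mean, median


def find_import_cycles(imports):
    """Find all unique simple cycles in a {module: [imported modules]} mapping."""
    cycles = set()

    def dfs(start, node, path, on_path):
        for target in imports.get(node, ()):
            if target == start:
                # Closed a cycle. Rotate it so the smallest name comes first,
                # so the same cycle found from another start deduplicates.
                i = path.index(min(path))
                cycles.add(tuple(path[i:] + path[:i]))
            elif target not in on_path:
                on_path.add(target)
                path.append(target)
                dfs(start, target, path, on_path)
                path.pop()
                on_path.remove(target)

    for module in imports:
        dfs(module, module, [module], {module})
    return sorted(cycles)


graph = {"a": ["b"], "b": ["c", "a"], "c": ["a"], "d": []}
found = find_import_cycles(graph)
lengths = [len(cycle) for cycle in found]
print(found)
print(len(found), min(lengths), mean(lengths), median(lengths), max(lengths))
```

The last line mirrors the kind of statistics mentioned above: count, then min, average, median and max cycle length.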
import pyan
# Generate a module dependency graph as a DOT string
dot_source = pyan.create_modulegraph(
    filenames="pkg/**/*.py",
    root=".",         # project root; paths made relative to this
    format="dot",     # also: "svg", "html", "tgf", "yed", "text"
    colored=True,
    nested_groups=True,
)
See pyan.create_modulegraph() for the full list of parameters.
Items tagged with ☆ are new in Pyan3.
Graph creation:
- Nodes for functions and classes
- Edges for defines
- Edges for uses
  - This includes recursive calls ☆
- Grouping to represent defines, with or without nesting
- Coloring of nodes by filename
  - Unlimited number of hues ☆
Analysis:
- Name lookup across the given set of files
- Nested function definitions
- Nested class definitions ☆
- Nested attribute accesses like self.a.b ☆
- Inherited attributes ☆
  - Pyan3 also looks in base classes when resolving attributes. In the old Pyan, calls to inherited methods used to be picked up by contract_nonexistents() followed by expand_unknowns(), but that often generated spurious uses edges (because the wildcard *.name expands to X.name for all X that have an attribute called name).
- Resolution of super() based on the static type at the call site ☆
- MRO is (statically) respected in looking up inherited attributes and super() ☆
- Assignment tracking with lexical scoping
  - E.g. if self.a = MyFancyClass(), the analyzer knows that any references to self.a point to MyFancyClass
  - All binding forms are supported (assign, augassign, for, comprehensions, generator expressions, with) ☆
    - Name clashes between for loop counter variables and functions or classes defined elsewhere no longer confuse Pyan.
- self is defined by capturing the name of the first argument of a method definition, like Python does. ☆
- Simple item-by-item tuple assignments like x,y,z = a,b,c ☆
- Chained assignments a = b = c ☆
- Local scope for lambda, listcomp, setcomp, dictcomp, genexpr ☆
  - Keep in mind that list comprehensions gained a local scope (being treated like a function) only in Python 3. Thus, Pyan3, when applied to legacy Python 2 code, will give subtly wrong results if the code uses list comprehensions.
- Source filename and line number annotation ☆
  - The annotation is appended to the node label. If grouping is off, namespace is included in the annotation. If grouping is on, only source filename and line number information is included, because the group title already shows the namespace.
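To illustrate what statically respecting the MRO means for inherited attributes, here is a minimal sketch that computes a C3 linearization from class names alone and uses it for attribute lookup. It is not taken from Pyan's source; the function names and the dictionary inputs (bases, attrs) are assumptions for this example.

```python
def c3_linearize(cls, bases):
    """Compute the C3 MRO of cls from a {class name: [base names]} mapping."""
    parents = bases.get(cls, [])
    sequences = [c3_linearize(p, bases) for p in parents] + [list(parents)]
    mro = [cls]
    while any(sequences):
        for seq in sequences:
            # Accept the first head that is not in the tail of any sequence.
            if seq and not any(seq[0] in other[1:] for other in sequences):
                head = seq[0]
                break
        else:
            raise TypeError(f"inconsistent class hierarchy for {cls}")
        mro.append(head)
        sequences = [[c for c in seq if c != head] for seq in sequences]
    return mro


def resolve_attribute(cls, name, bases, attrs):
    """Return "Owner.name" for the first class in the MRO that defines name."""
    for klass in c3_linearize(cls, bases):
        if name in attrs.get(klass, ()):
            return f"{klass}.{name}"
    return None  # unresolved


# Diamond: D(B, C), B(A), C(A)
bases = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}
attrs = {"A": {"save", "load"}, "C": {"save"}}
print(c3_linearize("D", bases))
print(resolve_attribute("D", "save", bases, attrs))
```

With a lookup like this, a call to self.save() inside D can be attributed to one specific definition instead of a wildcard over every class that has a save attribute.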
For the full list of planned improvements and known limitations, see TODO_DEFERRED.md.
- Determine confidence of detected edges (probability that the edge is correct)
- Improve the wildcard resolution mechanism, see discussion here
- Type inference for function arguments (would reduce wildcard noise)
- Prefix methods by class name in the graph; create a legend for annotations. See the discussion here
The analyzer does not currently support:
- Tuples/lists as first-class values (currently ignores any assignment of a tuple/list to a single name)
- Starred assignment a,*b,c = d,e,f,g,h (basic tuple unpacking works; starred targets overapproximate)
- Slicing and indexing in assignment (ast.Subscript)
- Additional unpacking generalizations (PEP 448, Python 3.5+)
  - Any uses on the RHS at the binding site in all of the above are already detected by the name and attribute analyzers, but the binding information from assignments of these forms will not be recorded (at least not correctly).
- Enums; need to mark the use of any of their attributes as use of the Enum
- Resolving results of function calls, except for a very limited special case for super()
- Distinguishing between different Lambdas in the same namespace
- Type inference for function arguments
From the viewpoint of graphing the defines and uses relations, the interesting parts of the AST are bindings (defining new names, or assigning new values to existing names), and any name that appears in an ast.Load context (i.e. a use). The latter includes function calls; the function's name then appears in a load context inside the ast.Call node that represents the call site.
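The distinction between bindings and uses can be seen directly with the standard ast module. The snippet below is a small standalone sketch, not part of Pyan; it walks a tiny example program and sorts names into bindings, uses, and call sites.

```python
import ast

source = """
def helper():
    pass

def main():
    result = helper()
    return result
"""

bindings, uses, calls = set(), set(), set()
for node in ast.walk(ast.parse(source)):
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
        bindings.add(node.name)          # a definition binds a new name
    elif isinstance(node, ast.Name):
        if isinstance(node.ctx, ast.Store):
            bindings.add(node.id)        # assignment target
        elif isinstance(node.ctx, ast.Load):
            uses.add(node.id)            # the name is read, i.e. used
    elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        calls.add(node.func.id)          # callee name, also seen in Load context

print(sorted(bindings), sorted(uses), sorted(calls))
```

Note that helper shows up both as a use and as a call: the ast.Call node wraps an ast.Name whose context is ast.Load.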
Bindings are tracked, with lexical scoping, to determine which type of object, or which function, each name points to at any given point in the source code being analyzed. This allows tracking things like:
def some_func():
    pass

class MyClass:
    def __init__(self):
        self.f = some_func

    def dostuff(self):
        self.f()

By tracking the name self.f, the analyzer will see that MyClass.dostuff() uses some_func().
The analyzer also needs to keep track of what type of object self currently points to. In a method definition, the literal name representing self is captured from the argument list, as Python does; then in the lexical scope of that method, that name points to the current class (since Pyan cares only about object types, not instances).
Of course, this simple approach cannot correctly track cases where the current binding of self.f depends on the order in which the methods of the class are executed. To keep things simple, Pyan decides to ignore this complication, just reads through the code in a linear fashion (twice so that any forward-references are picked up), and uses the most recent binding that is currently in scope.
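The approach described above can be sketched in a few dozen lines. This is a heavily simplified illustration, not Pyan's analyzer: the class SelfAttrTracker and its attributes are invented for this example, it only handles assignments of a plain name to an attribute of self, and it treats any function nested in a class as a method.

```python
import ast

SOURCE = """
def some_func():
    pass

class MyClass:
    def dostuff(self):
        self.f()

    def __init__(self):
        self.f = some_func
"""


class SelfAttrTracker(ast.NodeVisitor):
    def __init__(self):
        self.class_name = None
        self.method_name = None
        self.self_name = None
        self.bindings = {}  # (class name, attribute) -> bound name
        self.uses = []      # (user, used) pairs

    def visit_ClassDef(self, node):
        previous = self.class_name
        self.class_name = node.name
        self.generic_visit(node)
        self.class_name = previous

    def visit_FunctionDef(self, node):
        previous = (self.method_name, self.self_name)
        self.method_name = node.name
        # Capture the literal name of the first argument, as Python does.
        in_method = self.class_name is not None and node.args.args
        self.self_name = node.args.args[0].arg if in_method else None
        self.generic_visit(node)
        self.method_name, self.self_name = previous

    def _is_self_attr(self, node):
        return (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id == self.self_name)

    def visit_Assign(self, node):
        if isinstance(node.value, ast.Name):
            for target in node.targets:
                if self._is_self_attr(target):
                    self.bindings[(self.class_name, target.attr)] = node.value.id
        self.generic_visit(node)

    def visit_Call(self, node):
        if self._is_self_attr(node.func):
            bound = self.bindings.get((self.class_name, node.func.attr))
            if bound is not None:
                self.uses.append((f"{self.class_name}.{self.method_name}", bound))
        self.generic_visit(node)


tree = ast.parse(SOURCE)
tracker = SelfAttrTracker()
tracker.visit(tree)   # first pass: collects bindings, misses forward references
tracker.uses.clear()
tracker.visit(tree)   # second pass: bindings are known, so uses resolve
print(tracker.uses)
```

In this example dostuff() appears before __init__(), so the first pass alone would not find the use; the second pass picks it up because the binding for self.f has been recorded by then.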
When a binding statement is encountered, the current namespace determines in which scope to store the new value for the name. Similarly, when encountering a use, the current namespace determines which object type or function to tag as the user.
See AUTHORS.md.
GPL v2, as per comments here.
