CodeQL for C/C++
.. rst-class:: setup
For this example you need to set up CodeQL for Visual Studio Code and download the CodeQL database for ChakraCore from GitHub.
.. rst-class:: agenda
- Control flow graphs
- Exercise: use after free
- Recursion over the control flow graph
- Basic blocks
- Guard conditions
We frequently want to ask questions about the possible order of execution for a program.
Example:

.. code-block:: cpp

   if (x) {
     return 1;
   } else {
     return 2;
   }

Possible execution order is usually represented by a control flow graph:
.. graphviz::

   digraph {
   graph [ dpi = 1000 ]
   node [shape=polygon,sides=4,color=blue4,style="filled,rounded",fontname=consolas,fontcolor=white]
   a [label=<if<BR /><FONT POINT-SIZE="10">IfStmt</FONT>>]
   b [label=<x<BR /><FONT POINT-SIZE="10">VariableAccess</FONT>>]
   c [label=<1<BR /><FONT POINT-SIZE="10">Literal</FONT>>]
   d [label=<2<BR /><FONT POINT-SIZE="10">Literal</FONT>>]
   e [label=<return<BR /><FONT POINT-SIZE="10">ReturnStmt</FONT>>]
   f [label=<return<BR /><FONT POINT-SIZE="10">ReturnStmt</FONT>>]
   a -> b
   b -> {c, d}
   c -> e
   d -> f
   }
Note
The control flow graph is a static over-approximation of possible control flow at runtime. Its nodes are program elements such as expressions and statements. If there is an edge from one node to another, then it means that the semantic operation corresponding to the first node may be immediately followed by the operation corresponding to the second node. Some nodes (such as conditions of “if” statements or loop conditions) have more than one successor, representing conditional control flow at runtime.
The control flow is modeled with a CodeQL class, ControlFlowNode. Examples of control flow nodes include statements and expressions.
ControlFlowNode provides an API for traversing the control flow graph:

- ControlFlowNode ControlFlowNode.getASuccessor()
- ControlFlowNode ControlFlowNode.getAPredecessor()
- ControlFlowNode ControlFlowNode.getATrueSuccessor()
- ControlFlowNode ControlFlowNode.getAFalseSuccessor()
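As a minimal sketch of these predicates in use, the query below lists every node that has both a true and a false successor, that is, every node where control flow branches on a condition. It assumes only the standard cpp library; the results depend on the database being queried.

.. code-block:: ql

   import cpp

   // Each result row pairs a branching node with one of its true
   // successors and one of its false successors.
   from ControlFlowNode n, ControlFlowNode onTrue, ControlFlowNode onFalse
   where
     onTrue = n.getATrueSuccessor() and
     onFalse = n.getAFalseSuccessor()
   select n, onTrue, onFalse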
The control flow graph is intra-procedural; in other words, it only models paths within a function. To find the associated function, use:

- Function ControlFlowNode.getControlFlowScope()
Note
The control flow graph is similar in concept to data flow graphs. In contrast to data flow, however, the AST nodes themselves serve as the control flow graph nodes.
The predecessor/successor predicates are prime examples of member predicates with results that are used in functional syntax, but that are not actually functions. This is because a control flow node may have any number of predecessors and successors (including zero or more than one).
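To see that these predicates are relations rather than functions, one can count how many results they have for each node. The following sketch does this for the nodes of a function named main; the function name is an assumption chosen for illustration, so substitute one that exists in your database.

.. code-block:: ql

   import cpp

   // count(...) aggregates over all results of the predicate call,
   // so a node can report zero, one, or several successors.
   from ControlFlowNode n
   where n.getControlFlowScope().hasName("main")
   select n, count(n.getASuccessor()) as successors,
     count(n.getAPredecessor()) as predecessors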
Find calls to free that are reachable from an allocation on the same variable:
.. literalinclude:: ../query-examples/cpp/control-flow-cpp-1.ql
   :language: ql
Note
Predicates allocationCall and freeCall are defined in the standard library and model a number of standard alloc/free-like functions.
Based on this query, write a query that finds accesses to the variable that occur after the free.
.. rst-class:: build
What do you find? What problems occur with this approach to detecting use-after-free vulnerabilities?
.. rst-class:: build

.. literalinclude:: ../query-examples/cpp/control-flow-cpp-2.ql
   :language: ql
The main problem we observed in the previous exercise was that the successor relation is unaware of changes to the variable that would invalidate our results.
We can fix this by writing our own successor predicate that stops traversing the CFG if the variable is re-defined.
.. code-block:: ql

   ControlFlowNode reachesWithoutReassignment(FunctionCall free, LocalScopeVariable v)
   {
     freeCall(free, v.getAnAccess()) and
     (
       // base case
       result = free
       or
       // recursive case
       exists(ControlFlowNode mid |
         mid = reachesWithoutReassignment(free, v) and
         result = mid.getASuccessor() and
         // stop tracking when the value may change
         not result = v.getAnAssignedValue() and
         not result.(AddressOfExpr).getOperand() = v.getAnAccess()
       )
     )
   }

Find local variables that are written to, and then never accessed again.
Hint: Use LocalVariable.getAnAssignment().
.. rst-class:: build
.. literalinclude:: ../query-examples/cpp/control-flow-cpp-3.ql
   :language: ql
.. rst-class:: background2
BasicBlock represents basic blocks, that is, straight-line sequences of control flow nodes without branching.
- ControlFlowNode BasicBlock.getNode(int)
- BasicBlock BasicBlock.getASuccessor()
- BasicBlock BasicBlock.getAPredecessor()
- BasicBlock BasicBlock.getATrueSuccessor()
- BasicBlock BasicBlock.getAFalseSuccessor()
Often, queries can be made more efficient by treating basic blocks as a unit instead of reasoning about individual control flow nodes.
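As a small illustration of the BasicBlock API, the sketch below pairs each control flow node with its basic block and its index within that block. The restriction to a function named main is an assumption made for illustration only.

.. code-block:: ql

   import cpp

   // n is the node at index i within basic block bb.
   from BasicBlock bb, int i, ControlFlowNode n
   where
     n = bb.getNode(i) and
     n.getControlFlowScope().hasName("main")
   select bb, i, n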
Write a query to find unreachable basic blocks.
Hint: First define a recursive predicate to identify reachable blocks. Class EntryBasicBlock may be useful.
.. rst-class:: build
.. literalinclude:: ../query-examples/cpp/control-flow-cpp-4.ql
   :language: ql
Note
This query produces a good number of false positives on ChakraCore, many of them related to templates and macros.
A GuardCondition is a Boolean condition that controls one or more basic blocks in the sense that it is known to be true/false at the entry of those blocks.
- GuardCondition.controls(BasicBlock bb, boolean outcome): the entry of bb can only be reached if the guard evaluates to outcome
- GuardCondition.comparesLt, GuardCondition.ensuresLt, GuardCondition.comparesEq: auxiliary predicates to identify conditions that guarantee that one expression is less than/equal to another
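A minimal sketch of controls in use: the query below reports variable accesses whose basic block is controlled by a guard, together with the outcome the guard must have for the access to be reached. It assumes the guards library is imported from semmle.code.cpp.controlflow.Guards, and it uses ControlFlowNode.getBasicBlock() to map a node to its enclosing block.

.. code-block:: ql

   import cpp
   import semmle.code.cpp.controlflow.Guards

   // access can only be reached if guard evaluates to outcome.
   from GuardCondition guard, VariableAccess access, boolean outcome
   where guard.controls(access.getBasicBlock(), outcome)
   select access, guard, outcome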
- CodeQL for C/C++: https://codeql.github.com/docs/codeql-language-guides/codeql-for-cpp/
- API reference: https://codeql.github.com/codeql-standard-libraries/cpp
.. rst-class:: end-slide
.. rst-class:: background2
The default implementation of call target resolution does not handle function pointers, because they are difficult to deal with in general.
We can, however, add support for particular patterns of use by contributing a new override of Call.getTarget.
Write a query that finds all calls for which no call target can be determined, and run it on libjpeg-turbo.
Examine the results. What do you notice?
.. rst-class:: build
.. code-block:: ql

   import cpp

   from Call c
   where not exists(c.getTarget())
   select c
.. rst-class:: build
- Many results are calls through struct fields emulating virtual dispatch.
Write a query that resolves the call at cjpeg.c:640.
Hint: Use classes ExprCall, PointerDereferenceExpr, and Access.
.. rst-class:: build
.. literalinclude:: ../query-examples/cpp/control-flow-cpp-5.ql
   :language: ql
Create a subclass of ExprCall that uses your query to implement getTarget.
.. rst-class:: build
.. code-block:: ql

   class CallThroughVariable extends ExprCall {
     Variable v;

     CallThroughVariable() {
       exists(PointerDereferenceExpr callee | callee = getExpr() |
         callee.getOperand() = v.getAnAccess()
       )
     }

     override Function getTarget() {
       result = super.getTarget() or
       exists(Access init | init = v.getAnAssignedValue() |
         result = init.getTarget()
       )
     }
   }

The default control-flow graph implementation recognizes a few common patterns for non-returning functions, but sometimes it fails to spot them, which can cause imprecision.
We can add support for new non-returning functions by overriding ControlFlowNode.getASuccessor().
Write a query that finds all calls to a field called error_exit.
Hint: Reuse (parts of) the CallThroughVariable class from before.
.. rst-class:: build
.. code-block:: ql

   class CallThroughVariable extends ExprCall { ... }

   class ErrorExitCall extends CallThroughVariable {
     override Field v;

     ErrorExitCall() { v.getName() = "error_exit" }
   }

   from ErrorExitCall eec
   select eec

Override ControlFlowNode to mark calls to error_exit as non-returning.
Hint: ExprCall is an indirect subclass of ControlFlowNode.
.. rst-class:: build
.. code-block:: ql

   class CallThroughVariable extends ExprCall { ... }

   class ErrorExitCall extends CallThroughVariable {
     override Field v;

     ErrorExitCall() { v.getName() = "error_exit" }

     override ControlFlowNode getASuccessor() { none() }
   }

The Options library defines a CustomOptions class with various member predicates that can be overridden to customize aspects of the analysis.
In particular, it has an exprExits predicate that can be overridden to more easily perform the customization on the previous slide:
.. code-block:: ql

   import Options

   class MyOptions extends CustomOptions {
     override predicate exprExits(Expr e) {
       super.exprExits(e) or ...
     }
   }
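For the error_exit example, the elided disjunct could be filled in along the following lines. This is only a sketch: the pattern match mirrors the shape of the CallThroughVariable class above (a call through a dereferenced field access), and the field name is taken from the earlier exercise. Both are assumptions to adapt to the code base being analyzed.

.. code-block:: ql

   import cpp
   import Options

   class MyOptions extends CustomOptions {
     override predicate exprExits(Expr e) {
       super.exprExits(e)
       or
       // Assumption: a call of the form (*p->error_exit)(...) never returns.
       exists(FieldAccess fa |
         fa = e.(ExprCall).getExpr().(PointerDereferenceExpr).getOperand() and
         fa.getTarget().hasName("error_exit")
       )
     }
   }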