C++: Use the new dataflow library in cpp/missing-check-scanf#12818
C++: Use the new dataflow library in cpp/missing-check-scanf#12818MathiasVP merged 1 commit intogithub:mainfrom
cpp/missing-check-scanf#12818Conversation
3d8d3f6 to
bef7b00
Compare
bef7b00 to
0db05fe
Compare
|
The DCA run claims there are 10,258 fixed alerts for |
Indeed. It's also by far the query with the most number of results on CodeQL-data. I think one of the problems is that the original query raised an issue on any use of the variable that was identified as being problematic. However, by disabling flow out of sinks (i.e., using |
|
I'm having difficulty interpreting the DCA query differences to be honest, I'll see if things look a bit clearer on MRVA instead. |
geoffw0
left a comment
There was a problem hiding this comment.
QL looks fine.
On my (small) MRVA run I saw a reduction in duplicate results along the lines of:
scanf(..., x);
a = x; // result
b = x; // duplicate result
👍
This PR rewrites the
cpp/missing-check-scanfquery to use the new use-use based dataflow library.
I'm a bit disappointed that there is still so much manually written dataflow (the fwdFlow and revFlow predicates in particular) that the data flow libraries are apparently unable to do for us. I had hoped to see the whole query reduced to one or two standard looking dataflow configs.
I guess we don't actually want interprocedural dataflow in this case?
| * Holds if `instr` is control-flow reachable in 0 or more steps from | ||
| * a call to a `scanf`-like function. | ||
| */ | ||
| /** Holds if `n` reaches an argument to a call to a `scanf`-like function. */ |
There was a problem hiding this comment.
| /** Holds if `n` reaches an argument to a call to a `scanf`-like function. */ | |
| /** Holds if `n` reaches an argument to a call to a `scanf`-like function. */ |
|
We could convert it to interprocedural dataflow quite easily, yeah. That would get rid of those forwards and backwards localflow predicates. I just wasn't sure if we wanted to do that. |
|
I don't think interprocedural results will be interesting. I think this is a mistake people make locally and interprocedural results will more likely be a source of false positives. I guess I'd look to use |
Exactly. This PR does use Dataflow::localFlowStep, but performs the transitive closure manually to get sanitization |
This PR rewrites the
cpp/missing-check-scanfquery to use the new use-use based dataflow library.When we introduced the query #10163 it was deemed too difficult to use dataflow for the query since we couldn't properly mark one
scanfas being a sanitizer of another call toscanf. However, this is now completely trivial with use-use flow. That makes the query a perfect fit for dataflow now 🎉I've also changed the query to only raise an alert if a variable isn't initialized prior to the
scanfcall. This removes a very large number of results, and I think this is for the better.I'm running a difference query on MRVA now to see the impact of this change.