| 1 | +[[ADR-0007]] |
| 2 | += ADR-0007: Proposal How to Mark Findings With Hashes to Find Duplicates |
| 3 | + |
| 4 | +[cols="h,d",grid=rows,frame=none,stripes=none,caption="Status",%autowidth] |
| 5 | +|==== |
| 6 | +// Use one of the ADR status parameters, depending on the status.
| 7 | +// On a 'superseded' ADR, please add a cross-reference link to the new ADR.
| 8 | +// e.g.: {adr_suposed_by} <<ADR-0000>> |
| 9 | +| Status |
| 10 | +| PROPOSED |
| 11 | + |
| 12 | +| Date |
| 13 | +| 2020-11-25 |
| 14 | + |
| 15 | +| Author(s) |
| 16 | +| Sven Strittmatter <Sven.Strittmatter@iteratec.com> |
| 17 | +// ... |
| 18 | +|==== |
| 19 | + |
| 20 | +NOTE: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in https://tools.ietf.org/html/rfc2119[RFC 2119]. |
| 21 | + |
| 22 | +== Context |
| 23 | + |
| 24 | +We need a way to detect duplicate findings. One use case: after accepting a finding, we want to ignore the same finding in future scans.
| 25 | + |
| 26 | +=== Assumptions |
| 27 | + |
| 28 | +* The execution order of _hooks_ is unspecified. |
| 29 | +* The information whether a finding's hash is a duplicate MUST NOT be stored or maintained in the _SCB_ S3 storage.
| 30 | +* The _SCB_ MUST NOT remove findings: _read-write-hooks_ may alter them, but never delete or filter them out. |
| 31 | +** A _read-hook_ MAY decide not to store a finding in an external system.
| 32 | + |
| 33 | +== Decision |
| 34 | + |
| 35 | +* We generate a hash for each finding so we can compare findings by the hash and identify duplicates. |
| 36 | +* This hash MUST be mutable and MAY be altered by _read-write-hooks_, because we don't want to introduce an exception to what a _read-write-hook_ can alter.
| 37 | +* The _parser_ MUST generate the initial hash of a finding from some of its attributes (e.g. name, location, category ...).
| 38 | +** Each _scanner_ MUST define a default set of attributes used for the hashing. |
| 39 | +** This default set of hashed attributes MAY be overridden.
| 40 | +* Each _read-write-hook_ MUST update the hash as its last step, because the _hook_ MAY have changed a hashed attribute.
| 41 | + |
| 42 | +We will first implement the hashing step in the _parser_, behind a feature flag, to evaluate this proposal.
| 43 | + |
| 44 | +== Consequences |
| 45 | + |
| 46 | +* We don't need to introduce an ordering for the _read-write-hooks_. |
| 46 | +* The duplicate detection/handling MUST be done in another service with its own data storage, because there is no stable hash until the _read-hooks_ are executed, and these MUST NOT alter the data in the _SCB_ itself. However, the _read-hooks_ MAY decide not to store data in an external system.
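As a rough illustration of the second consequence, the sketch below keeps the set of known hashes in a store of its own and lets a _read-hook_ skip findings whose hash is already known. Everything here is hypothetical: the class `DuplicateStore`, the use of SQLite as the external data storage, and the shape of the hook are assumptions for this example and are not part of the proposal.

```python
import sqlite3


class DuplicateStore:
    """Hypothetical duplicate store with its own storage, outside the SCB S3 storage."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS seen_hashes (hash TEXT PRIMARY KEY)"
        )

    def is_duplicate(self, finding_hash):
        """Report whether the hash was seen before, recording it if it is new."""
        row = self._db.execute(
            "SELECT 1 FROM seen_hashes WHERE hash = ?", (finding_hash,)
        ).fetchone()
        if row is None:
            self._db.execute(
                "INSERT INTO seen_hashes (hash) VALUES (?)", (finding_hash,)
            )
            self._db.commit()
        return row is not None


def read_hook(findings, store):
    """Example read-hook: returns the findings it would forward to an external system.

    The input list is not modified, so the data in the SCB stays untouched.
    """
    return [f for f in findings if not store.is_duplicate(f["hash"])]
```

The hook only decides what it forwards; the findings themselves remain unchanged, in line with the assumption that the _SCB_ never removes findings.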