cobalt/tools: Robustify analysis pipeline against invalid timestamps (#7862)
The smaps pipeline was too complicated: it tried to aggregate similarly
named memory consumers, such as those matching the pattern
"mem/shared_memory", among others. There are concerns that this
aggregation masked or dropped smaps data, leading to low PSS and RSS
counts. This change speculatively removes the aggregation in case it is
the cause. It also fixes some bugs in the visualization script and adds
tests to ensure memory leaks are properly accounted for.
The smaps analysis pipeline failed on a couple of attempts due to
visualize_smaps_analysis.py receiving invalid timestamp strings from
analyze_smaps_logs.py. This occurred when analyze_smaps_logs.py
encountered log files with names that did not conform to the expected
timestamp format.
This change addresses the issue by:
- Modifying analyze_smaps_logs.py's extract_timestamp function to return
None for filenames that do not contain a valid timestamp, instead of a
placeholder string (a minimal sketch follows this list).
- Updating the file filtering logic in analyze_smaps_logs.py to skip any
log files for which a valid timestamp cannot be extracted, preventing
them from being passed to downstream visualization.
- Correcting the regular expression in extract_timestamp to accurately
match the _processed.txt suffix of the log files generated by
read_smaps_batch.py.
- Ensuring the run_analysis_pipeline.py script uses the correct log
directory path.
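The first three points amount to a stricter contract between the analyzer
and the visualizer: filenames without a parseable timestamp are dropped
before anything is handed downstream. A minimal sketch of that behavior,
assuming a YYYYMMDD_HHMMSS timestamp embedded in the _processed.txt
filenames (the exact format produced by read_smaps_batch.py is an
assumption here):

```python
import os
import re
from typing import Optional

# Assumed filename layout, e.g. "smaps_20240101_120000_processed.txt".
_TIMESTAMP_RE = re.compile(r'(\d{8}_\d{6})_processed\.txt$')


def extract_timestamp(filename: str) -> Optional[str]:
  """Returns the timestamp embedded in a processed smaps filename, or None."""
  match = _TIMESTAMP_RE.search(os.path.basename(filename))
  return match.group(1) if match else None


def select_log_files(filenames):
  """Keeps only files whose names carry a parseable timestamp."""
  return [name for name in filenames if extract_timestamp(name) is not None]
```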
In addition, this change adds a test hardening data integrity for leak
detection. analyze_smaps_logs_test.py includes a test
(test_analyze_logs_json_output) that simulates a memory leak. It creates
dummy smaps files where a component (<leaking_lib>) shows increasing PSS
and RSS values over time. This test then asserts that this time-series
memory growth is correctly captured and structured within the JSON
output. This ensures that the foundational data required for identifying
leaks is accurately processed.
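The assertion pattern in that test might look roughly like the sketch
below; the snapshot layout and field names are invented for illustration
and are not the exact schema produced by analyze_smaps_logs.py:

```python
import json
import unittest


class LeakTimeSeriesSketchTest(unittest.TestCase):

  def test_leak_growth_survives_json_round_trip(self):
    # Dummy time series for a leaking component: PSS and RSS grow by 100 kB
    # per snapshot, mimicking the increasing values in the dummy smaps files.
    snapshots = [
        {'timestamp': f'2024-01-01T00:0{i}:00',
         'pss_kb': 1000 + 100 * i,
         'rss_kb': 2000 + 100 * i}
        for i in range(3)
    ]
    blob = json.dumps({'<leaking_lib>': snapshots})

    # The growth must still be visible, and in order, after parsing the JSON.
    parsed = json.loads(blob)['<leaking_lib>']
    pss_values = [point['pss_kb'] for point in parsed]
    self.assertEqual(pss_values, [1000, 1100, 1200])


if __name__ == '__main__':
  unittest.main()
```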
#vibe-coded
Bug: 456178181
After processing a batch of smaps files, you can use `analyze_smaps_logs.py` to inspect the final memory state of the run. It reads a directory of processed smaps files and prints a detailed, non-aggregated memory breakdown from the last log file in the time series.
The report includes:
* The top 10 largest memory consumers by PSS and RSS at the end of the run.
* The top 10 memory regions that have grown the most in PSS and RSS over the duration of the run.
* The overall change in total PSS and RSS.
The primary purpose of this script is to either provide a snapshot of the final memory layout or to generate a JSON file containing the full time-series data. This JSON output is essential for the `visualize_smaps_analysis.py` script, which handles aggregation and visualization.
The script can also output a structured JSON file containing the time-series data for further analysis or visualization.
To simplify the analysis process, the `run_analysis_pipeline.py` script combines the batch processing, analysis, and visualization steps into a single command.
### `run_analysis_pipeline.py`
This script takes a directory of raw smaps logs and generates the final visualization PNG, handling all intermediate steps automatically.
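Conceptually the pipeline just chains the three existing scripts back to back. A rough sketch of that orchestration is shown below; the flag names passed to each step are assumptions for illustration, not the actual arguments defined by `run_analysis_pipeline.py`:

```python
import subprocess
import sys


def run_pipeline(raw_log_dir, output_png):
  """Runs batch processing, analysis, and visualization in sequence."""
  # Flag names below are illustrative only; run_analysis_pipeline.py defines
  # the real arguments it forwards to each script.
  subprocess.run(
      [sys.executable, 'read_smaps_batch.py', raw_log_dir], check=True)
  subprocess.run(
      [sys.executable, 'analyze_smaps_logs.py', raw_log_dir,
       '--json_output', 'analysis.json'], check=True)
  subprocess.run(
      [sys.executable, 'visualize_smaps_analysis.py', 'analysis.json',
       '--output', output_png], check=True)
```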
The accuracy of this toolchain depends on its aggregation rules, which are heuristics based on known memory patterns. As Cobalt, Android, and third-party libraries evolve, new memory region names can appear. It is crucial to periodically check for and categorize these new regions to prevent gaps in the analysis.
### How to Check for New Patterns
1. **Temporarily Disable Aggregation:** Open `run_analysis_pipeline.py` and remove the `-d` (or `--aggregate_android`) flag from the `batch_args` list. This will cause the batch processor to output a "raw" report with no special grouping.
2. **Run the Pipeline:** Execute the modified script on a recent and representative set of `smaps` logs.
3. **Examine the Raw Output:** The analysis printed to the console will now be much more detailed. Scan the "Top Largest Consumers" and "Top Memory Increases" lists. Look for patterns or repeated names that are not being grouped, such as:
* New `[anon:<name>]` labels (e.g., we discovered `[anon:scudo:*]`).
* Driver or shared memory regions (e.g., `/dev/ashmem/*`).
* JIT or code cache regions (e.g., `/memfd:jit-cache`).
* Any other large, unexplained region that appears frequently.
4. **Add New Aggregation Rules:** Open `read_smaps.py` and add new `re.sub()` rules within the `if args.aggregate_android:` block. Place more specific rules *before* more general ones.
```python
# Example for adding a new rule for Skia resources
if args.aggregate_android:
  key = re.sub(r'\[(anon:skia.*)\]', r'<\1>', key)  # New rule
```
5. **Re-enable Aggregation:** Add the `-d` flag back to `run_analysis_pipeline.py` and re-run the pipeline to confirm that your new categories appear correctly.
By following this process periodically, you can maintain a comprehensive and accurate view of the application's memory usage.
## Extending the Toolchain with New Fields
The toolchain is designed to be extensible, allowing you to add new fields from the raw smaps files to the analysis and visualization. The key is to follow the data pipeline from the processor (`read_smaps.py`) to the analyzer (`analyze_smaps_logs.py`) and visualizer (`visualize_smaps_analysis.py`).
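As a hedged illustration of the first hop only, pulling one extra per-region field such as `SwapPss` out of a raw smaps entry could look like the sketch below; the regex and function name are assumptions, and the corresponding analyzer and visualizer plumbing is not shown:

```python
import re

# Matches per-region metric lines such as "SwapPss:             12 kB".
_SWAP_PSS_RE = re.compile(r'^SwapPss:\s+(\d+) kB', re.MULTILINE)


def extract_swap_pss_kb(smaps_region_text: str) -> int:
  """Returns the SwapPss value in kB for one smaps region, or 0 if absent."""
  match = _SWAP_PSS_RE.search(smaps_region_text)
  return int(match.group(1)) if match else 0
```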