Commit 448df2e

cobalt/tools: Robustify analysis pipeline against invalid timestamps (#7862)
The smaps pipeline was too complicated, trying to aggregate some of the similarly named memory consumers, such as those matching the pattern "mem/shared_memory", among others. There are concerns that this masked or dropped smaps data, leading to low counts for PSS and RSS. This change speculatively removes that aggregation in case it is the problem. It also fixes some bugs in the visualization script and adds testing to ensure memory leaks are properly accounted for.

The smaps analysis pipeline failed on a couple of attempts because visualize_smaps_analysis.py received invalid timestamp strings from analyze_smaps_logs.py. This occurred when analyze_smaps_logs.py encountered log files with names that did not conform to the expected timestamp format. This change addresses the issue by:

- Modifying analyze_smaps_logs.py's extract_timestamp function to return None for filenames that do not contain a valid timestamp, instead of a placeholder string.
- Updating the file filtering logic in analyze_smaps_logs.py to skip any log files for which a valid timestamp cannot be extracted, preventing them from being passed to downstream visualization.
- Correcting the regular expression in extract_timestamp to accurately match the _processed.txt suffix of the log files generated by read_smaps_batch.py.
- Ensuring the run_analysis_pipeline.py script uses the correct log directory path.

On top of that, this adds a test hardening data integrity for leak detection. analyze_smaps_logs_test.py includes a test (test_analyze_logs_json_output) that simulates a memory leak: it creates dummy smaps files in which a component (<leaking_lib>) shows increasing PSS and RSS values over time, and then asserts that this time-series memory growth is correctly captured and structured in the JSON output. This ensures that the foundational data required for identifying leaks is accurately processed.

#vibe-coded

Bug: 456178181
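For illustration, a minimal sketch of the hardened timestamp handling described above. The filenames are invented examples, and the snippet assumes it is run from the smaps tools directory so the module is importable; the function itself appears in the analyze_smaps_logs.py diff below.

```python
# Illustrative sketch only: demonstrates the new extract_timestamp contract.
from analyze_smaps_logs import extract_timestamp

# Hypothetical file names -- not taken from a real capture.
names = [
    'smaps_20250101_120000_0001_processed.txt',  # valid -> '20250101_120000'
    'smaps_20250101_120500_0002_processed.txt',  # valid -> '20250101_120500'
    'notes.txt',                                 # no timestamp -> None
]

# Files without a parsable timestamp are now skipped up front instead of
# being sorted with a placeholder and handed to the visualization step.
kept = []
for name in names:
  ts = extract_timestamp(name)
  if ts:
    kept.append((ts, name))
kept.sort()
print(kept)
```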
1 parent df0ce3c commit 448df2e

5 files changed: +245, -199 lines changed

cobalt/tools/performance/__init__.py

Lines changed: 0 additions & 15 deletions
This file was deleted.

cobalt/tools/performance/smaps/README.md

Lines changed: 33 additions & 70 deletions
@@ -49,6 +49,35 @@ python3 smaps_capture.py [OPTIONS]
 python3 smaps_capture.py -o /tmp/my_smaps_logs -s R58M1293QYV
 ```
 
+## Unified Analysis Pipeline
+
+To simplify the analysis process, the `run_analysis_pipeline.py` script combines the batch processing, analysis, and visualization steps into a single command.
+
+### `run_analysis_pipeline.py`
+
+This script takes a directory of raw smaps logs and generates the final visualization PNG, handling all intermediate steps automatically.
+
+#### Usage
+
+```bash
+python3 run_analysis_pipeline.py <RAW_LOG_DIR> [OPTIONS]
+```
+
+#### Command-line Arguments
+
+* `<RAW_LOG_DIR>` (required positional argument)
+    The path to the directory containing the raw smaps log files.
+* `--output_image` (type: `str`, default: `smaps_analysis.png`)
+    The path where the final output PNG image will be saved.
+* `--platform` (type: `str`, choices: `android`, `linux`, default: `android`)
+    Specify the platform for platform-specific aggregations.
+
+#### Example
+
+```bash
+python3 run_analysis_pipeline.py cobalt_smaps_logs --output_image my_analysis.png --platform android
+```
+
 ## Smaps Analysis and Batch Processing
 
 This directory also contains scripts for analyzing the captured smaps data.
@@ -128,12 +157,9 @@ python3 read_smaps_batch.py <RAW_LOGS_DIR> [OPTIONS]
 
 ### `analyze_smaps_logs.py`
 
-After processing a batch of smaps files, you can use this script to analyze the entire run. It reads a directory of processed smaps files, tracks memory usage over time, and generates a summary report.
+After processing a batch of smaps files, you can use this script to inspect the final memory state of the run. It reads a directory of processed smaps files and prints a detailed, non-aggregated memory breakdown from the last log file in the time series.
 
-The report includes:
-* The top 10 largest memory consumers by PSS and RSS at the end of the run.
-* The top 10 memory regions that have grown the most in PSS and RSS over the duration of the run.
-* The overall change in total PSS and RSS.
+The primary purpose of this script is to either provide a snapshot of the final memory layout or to generate a JSON file containing the full time-series data. This JSON output is essential for the `visualize_smaps_analysis.py` script, which handles aggregation and visualization.
 
 The script can also output a structured JSON file containing the time-series data for further analysis or visualization.
 
@@ -148,11 +174,11 @@ python3 analyze_smaps_logs.py <PROCESSED_LOG_DIR> [OPTIONS]
 * `<PROCESSED_LOG_DIR>` (required positional argument)
     The path to the directory containing the processed smaps log files.
 * `--json_output` (type: `str`)
-    Optional: The path to a file where the JSON analysis output will be saved.
+    Optional: The path to a file where the JSON analysis output will be saved. This is required for visualization.
 
 #### Examples
 
-1. **Print a text-based analysis to the console:**
+1. **Print a detailed memory breakdown to the console:**
    ```bash
    python3 analyze_smaps_logs.py processed_logs
    ```
@@ -197,69 +223,6 @@ python3 visualize_smaps_analysis.py <JSON_FILE> [OPTIONS]
 python3 visualize_smaps_analysis.py analysis_output.json --output_image my_analysis.png
 ```
 
-## Unified Analysis Pipeline
-
-To simplify the analysis process, the `run_analysis_pipeline.py` script combines the batch processing, analysis, and visualization steps into a single command.
-
-### `run_analysis_pipeline.py`
-
-This script takes a directory of raw smaps logs and generates the final visualization PNG, handling all intermediate steps automatically.
-
-#### Usage
-
-```bash
-python3 run_analysis_pipeline.py <RAW_LOG_DIR> [OPTIONS]
-```
-
-#### Command-line Arguments
-
-* `<RAW_LOG_DIR>` (required positional argument)
-    The path to the directory containing the raw smaps log files.
-* `--output_image` (type: `str`, default: `smaps_analysis.png`)
-    The path where the final output PNG image will be saved.
-* `--platform` (type: `str`, choices: `android`, `linux`, default: `android`)
-    Specify the platform for platform-specific aggregations.
-
-#### Example
-
-```bash
-python3 run_analysis_pipeline.py cobalt_smaps_logs --output_image my_analysis.png --platform android
-```
-
-## Improving Aggregation Rules
-
-The accuracy of this toolchain depends on its aggregation rules, which are heuristics based on known memory patterns. As Cobalt, Android, and third-party libraries evolve, new memory region names can appear. It is crucial to periodically check for and categorize these new regions to prevent gaps in the analysis.
-
-### How to Check for New Patterns
-
-1. **Temporarily Disable Aggregation:** Open `run_analysis_pipeline.py` and remove the `-d` (or `--aggregate_android`) flag from the `batch_args` list. This will cause the batch processor to output a "raw" report with no special grouping.
-
-2. **Run the Pipeline:** Execute the modified script on a recent and representative set of `smaps` logs.
-
-   ```bash
-   python3 run_analysis_pipeline.py /path/to/your/recent/logs
-   ```
-
-3. **Examine the Raw Output:** The analysis printed to the console will now be much more detailed. Scan the "Top Largest Consumers" and "Top Memory Increases" lists. Look for patterns or repeated names that are not being grouped, such as:
-   * New `[anon:<name>]` labels (e.g., we discovered `[anon:scudo:*]`).
-   * Driver or shared memory regions (e.g., `/dev/ashmem/*`).
-   * JIT or code cache regions (e.g., `/memfd:jit-cache`).
-   * Any other large, unexplained region that appears frequently.
-
-4. **Add New Aggregation Rules:** Open `read_smaps.py` and add new `re.sub()` rules within the `if args.aggregate_android:` block. Place more specific rules *before* more general ones.
-
-   ```python
-   # Example for adding a new rule for Skia resources
-   if args.aggregate_android:
-     key = re.sub(r'\[(anon:skia.*)\]', r'<\1>', key)  # New rule
-     key = re.sub(r'\[(anon:scudo:.*)\]', r'<\1>', key)
-     # ... other rules
-   ```
-
-5. **Re-enable Aggregation:** Add the `-d` flag back to `run_analysis_pipeline.py` and re-run the pipeline to confirm that your new categories appear correctly.
-
-By following this process periodically, you can maintain a comprehensive and accurate view of the application's memory usage.
-
 ## Extending the Toolchain with New Fields
 
 The toolchain is designed to be extensible, allowing you to add new fields from the raw smaps files to the analysis and visualization. The key is to follow the data pipeline from the processor (`read_smaps.py`) to the analyzer (`analyze_smaps_logs.py`) and visualizer (`visualize_smaps_analysis.py`).
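As a quick illustration of the analyze-then-visualize handoff the updated README describes, the analysis step can also be driven programmatically through its entry point. This is a sketch only; `processed_logs` and `analysis_output.json` are the placeholder paths already used in the README examples.

```python
# Sketch: produce the JSON that visualize_smaps_analysis.py consumes.
from analyze_smaps_logs import run_smaps_analysis_tool

run_smaps_analysis_tool(
    argv=['processed_logs', '--json_output', 'analysis_output.json'])
# analysis_output.json can then be passed to visualize_smaps_analysis.py,
# which performs the aggregation and renders the output PNG.
```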

cobalt/tools/performance/smaps/analyze_smaps_logs.py

Lines changed: 53 additions & 93 deletions
@@ -16,11 +16,10 @@
 """Parses and analyzes processed smaps logs."""
 
 import argparse
-from collections import defaultdict, OrderedDict
+from collections import OrderedDict
 import json
 import os
 import re
-import sys
 
 
 class ParsingError(Exception):
@@ -78,36 +77,19 @@ def parse_smaps_file(filepath):
       # This will skip non-integer lines, like the repeated header
       continue
 
-  # Second pass for aggregation
-  aggregated_data = OrderedDict()
-  shared_mem_total = defaultdict(int)
-  for name, data in memory_data.items():
-    if name.startswith('mem/shared_memory'):
-      for field, value in data.items():
-        shared_mem_total[field] += value
-    else:
-      aggregated_data[name] = data
-
-  if shared_mem_total:
-    aggregated_data['[mem/shared_memory]'] = dict(shared_mem_total)
-
-  return aggregated_data, total_data
+  return memory_data, total_data
 
 
 def extract_timestamp(filename):
   """Extracts the timestamp (YYYYMMDD_HHMMSS) from the filename for sorting."""
-  match = re.search(r'_(\d{8})_(\d{6})_\d{4}_processed\.txt$', filename)
+  match = re.search(r'smaps_(\d{8})_(\d{6})_\d+_processed\.txt$', filename)
   if match:
     return f'{match.group(1)}_{match.group(2)}'
 
-  print(
-      f"Warning: Could not extract timestamp from '{filename}'. "
-      'File will be sorted last.',
-      file=sys.stderr)
-  return '00000000_000000'  # Default for files without a clear timestamp
+  return None  # Return None if no timestamp is found
 
 
-def get_top_consumers(memory_data, metric='pss', top_n=5):
+def get_top_consumers(memory_data, metric='pss', top_n=10):
   """Returns the top N memory consumers by a given metric."""
   if not memory_data:
     return []
@@ -120,35 +102,34 @@ def get_top_consumers(memory_data, metric='pss', top_n=5):
 
 def analyze_logs(log_dir, json_output_filepath=None):
   """Analyzes a directory of processed smaps logs."""
-  all_files = [
-      os.path.join(log_dir, f)
-      for f in os.listdir(log_dir)
-      if f.endswith('_processed.txt')
-  ]
-  all_files.sort(key=extract_timestamp)
-
-  if not all_files:
-    print(f'No processed smaps files found in {log_dir}')
+  all_files_with_ts = []
+  for f in os.listdir(log_dir):
+    if f.endswith('_processed.txt'):
+      filepath = os.path.join(log_dir, f)
+      timestamp = extract_timestamp(os.path.basename(filepath))
+      if timestamp:
+        all_files_with_ts.append((timestamp, filepath))
+
+  if not all_files_with_ts:
+    print(f'No processed smaps files with valid timestamps found in {log_dir}')
     return
 
+  # Sort files based on the extracted timestamp
+  all_files_with_ts.sort(key=lambda x: x[0])
+  all_files = [filepath for _, filepath in all_files_with_ts]
+
   print(f'Analyzing {len(all_files)} processed smaps files...')
 
   # List to store structured data for JSON output
   analysis_data = []
 
-  # Store data over time for each memory region
-  total_history = defaultdict(list)
-
-  first_timestamp = None
   last_timestamp = None
   last_memory_data = None
-  first_memory_data = None
+  last_total_data = None
 
   for filepath in all_files:
     filename = os.path.basename(filepath)
     timestamp = extract_timestamp(filename)
-    if not first_timestamp:
-      first_timestamp = timestamp
     last_timestamp = timestamp
 
     try:
@@ -157,10 +138,10 @@ def analyze_logs(log_dir, json_output_filepath=None):
       print(f'Warning: {e}')
       continue
 
-    if first_memory_data is None:
-      first_memory_data = memory_data
-    last_memory_data = memory_data  # Keep track of the last data
+    last_memory_data = memory_data
+    last_total_data = total_data
 
+    # Still collect all data for the JSON output
     current_snapshot = {
         'timestamp':
            timestamp,
@@ -174,11 +155,8 @@ def analyze_logs(log_dir, json_output_filepath=None):
     }
     analysis_data.append(current_snapshot)
 
-    for metric, value in total_data.items():
-      total_history[metric].append(value)
-
   print('\n' + '=' * 50)
-  print(f'Analysis from {first_timestamp} to {last_timestamp}')
+  print(f'Analysis of the last log: {last_timestamp}')
   print('=' * 50)
 
   # Output JSON data if requested
@@ -187,57 +165,38 @@ def analyze_logs(log_dir, json_output_filepath=None):
       json.dump(analysis_data, f, indent=2)
     print(f'JSON analysis saved to {json_output_filepath}')
 
-  # 1. Largest Consumers by the end log
-  print('\nOverall Total Memory Change:')
-  print('\nTop 10 Largest Consumers by the End Log (PSS):')
-  top_pss = get_top_consumers(last_memory_data, metric='pss', top_n=10)
-  for name, data in top_pss:
-    print(f" - {name}: {data.get('pss', 0)} kB PSS, "
-          f"{data.get('rss', 0)} kB RSS")
-
-  print('\nTop 10 Largest Consumers by the End Log (RSS):')
-  top_rss = get_top_consumers(last_memory_data, metric='rss', top_n=10)
-  for name, data in top_rss:
-    print(f" - {name}: {data.get('rss', 0)} kB RSS, "
-          f"{data.get('pss', 0)} kB PSS")
-
-  # 2. Top 10 Increases in Memory Over Time
-  print('\nTop 10 Memory Increases Over Time (PSS):')
-  pss_growth = []
-  if last_memory_data and first_memory_data:
-    all_keys = set(first_memory_data.keys()) | set(last_memory_data.keys())
-    for r_name in all_keys:
-      initial_pss = first_memory_data.get(r_name, {}).get('pss', 0)
-      final_pss = last_memory_data.get(r_name, {}).get('pss', 0)
-      growth = final_pss - initial_pss
-      if growth > 0:
-        pss_growth.append((r_name, growth))
-
-  pss_growth.sort(key=lambda item: item[1], reverse=True)
-  for name, growth in pss_growth[:10]:
-    print(f' - {name}: +{growth} kB PSS')
-
-  print('\nTop 10 Memory Increases Over Time (RSS):')
-  rss_growth = []
-  if last_memory_data and first_memory_data:
-    all_keys = set(first_memory_data.keys()) | set(last_memory_data.keys())
-    for r_name in all_keys:
-      initial_rss = first_memory_data.get(r_name, {}).get('rss', 0)
-      final_rss = last_memory_data.get(r_name, {}).get('rss', 0)
-      growth = final_rss - initial_rss
-      if growth > 0:
-        rss_growth.append((r_name, growth))
-
-  rss_growth.sort(key=lambda item: item[1], reverse=True)
-  for name, growth in rss_growth[:10]:
-    print(f' - {name}: +{growth} kB RSS')
+  # 1. Top 10 Consumers from the final log
+  if last_memory_data:
+    print('\nTop 10 Largest Consumers by PSS:')
+    top_pss = get_top_consumers(last_memory_data, metric='pss', top_n=10)
+    for name, data in top_pss:
+      print(f" - {name}: {data.get('pss', 0)} kB PSS")
+
+    print('\nTop 10 Largest Consumers by RSS:')
+    top_rss = get_top_consumers(last_memory_data, metric='rss', top_n=10)
+    for name, data in top_rss:
+      print(f" - {name}: {data.get('rss', 0)} kB RSS")
+
+  # 2. Detailed breakdown from the final log
+  if last_memory_data:
+    print('\n' + '-' * 50)
+    print('Full Memory Breakdown:')
+    for name, data in last_memory_data.items():
+      print(f" - {name}: {data.get('pss', 0)} kB PSS, "
+            f"{data.get('rss', 0)} kB RSS")
+
+  if last_total_data:
+    print('\n' + '-' * 50)
+    print('Total Memory:')
+    for metric, value in last_total_data.items():
+      print(f' - {metric.upper()}: {value} kB')
 
 
 def run_smaps_analysis_tool(argv=None):
   """Parses arguments and runs the smaps log analysis."""
   parser = argparse.ArgumentParser(
-      description='Analyze processed smaps logs to identify '
-      'memory consumers and growth.')
+      description='Analyzes processed smaps logs to display the memory '
+      'breakdown of the final log file.')
   parser.add_argument(
       'log_dir',
       type=str,
@@ -246,7 +205,8 @@ def run_smaps_analysis_tool(argv=None):
   parser.add_argument(
       '--json_output',
      type=str,
-      help='Optional: Path to a file where JSON output will be saved.')
+      help='Optional: Path to a file where JSON output will be saved for '
+      'use with other tools like visualize_smaps_analysis.py.')
   args = parser.parse_args(argv)
   analyze_logs(args.log_dir, args.json_output)
 
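The leak-detection test described in the commit message lives in analyze_smaps_logs_test.py, whose diff is not shown in this excerpt. Below is a minimal, hypothetical sketch of such a test: it stubs parse_smaps_file so it does not depend on the on-disk processed-file format, fakes a growing <leaking_lib> footprint, and checks only properties visible in this diff (one JSON snapshot per valid file, ordered by timestamp). The real test_analyze_logs_json_output additionally asserts that the increasing PSS/RSS values are captured in the JSON structure.

```python
# Illustrative sketch only -- not the actual analyze_smaps_logs_test.py.
import json
import os
import tempfile
import unittest
from unittest import mock

import analyze_smaps_logs


class AnalyzeLogsJsonOutputSketch(unittest.TestCase):

  def test_json_snapshots_follow_timestamp_order(self):
    with tempfile.TemporaryDirectory() as log_dir:
      # Three dummy processed logs with valid, increasing timestamps,
      # plus one file whose name has no timestamp and must be skipped.
      names = [
          'smaps_20250101_120000_0001_processed.txt',
          'smaps_20250101_120500_0002_processed.txt',
          'smaps_20250101_121000_0003_processed.txt',
          'junk_processed.txt',
      ]
      for name in names:
        with open(os.path.join(log_dir, name), 'w') as f:
          f.write('')

      # Fake a growing <leaking_lib> footprint, one snapshot per valid file,
      # so the sketch stays independent of the real processed-file format.
      snapshots = iter([
          ({'<leaking_lib>': {'pss': 100, 'rss': 200}}, {'pss': 100, 'rss': 200}),
          ({'<leaking_lib>': {'pss': 300, 'rss': 400}}, {'pss': 300, 'rss': 400}),
          ({'<leaking_lib>': {'pss': 500, 'rss': 600}}, {'pss': 500, 'rss': 600}),
      ])
      json_path = os.path.join(log_dir, 'analysis_output.json')
      with mock.patch.object(
          analyze_smaps_logs, 'parse_smaps_file',
          side_effect=lambda _unused: next(snapshots)):
        analyze_smaps_logs.analyze_logs(log_dir, json_path)

      with open(json_path) as f:
        data = json.load(f)

      # One snapshot per valid file, ordered by timestamp; the junk file
      # was filtered out before parsing.
      self.assertEqual(len(data), 3)
      timestamps = [snapshot['timestamp'] for snapshot in data]
      self.assertEqual(timestamps, sorted(timestamps))


if __name__ == '__main__':
  unittest.main()
```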
0 commit comments
