Skip to content

Commit 3d88005

Browse files
authored
cobalt/tools: Improved smaps analysis and visualization (#7829)
This commit introduces improvements to the smaps memory analysis toolchain: - Structured JSON Output: analyze_smaps_logs.py now supports an --json_output argument to generate a machine-readable JSON file containing time-series memory data. This enables external tools to consume and visualize the analysis results. - Memory Visualization Script: A new script, visualize_smaps_analysis.py, has been added. It consumes the JSON output and generates a comprehensive PNG dashboard with three key charts: - Total PSS and RSS over time. - Stacked area chart of the top 10 memory consumers. - Line chart of the top 10 memory regions with the most growth (potential leaks). - Top 10 Reporting: The analysis and visualization now consistently report the top 10 memory consumers and growers, providing a more detailed view. - Shared Memory Aggregation: All memory regions starting with "mem/shared_memory" are now aggregated into a single `[mem/shared_memory]` category for clearer analysis of shared memory impact. - Swap Memory Parsing: The toolchain now correctly parses and includes Swap and SwapPss memory fields in the analysis. - Updated Documentation and Tests: The README.md has been updated to reflect the new usage and functionality. Corresponding unit tests have been added or updated to ensure the correctness of these new features. #vibe-coded Bug: 456178181
1 parent a92f68b commit 3d88005

File tree

7 files changed

+530
-84
lines changed

7 files changed

+530
-84
lines changed

cobalt/tools/performance/smaps/README.md

Lines changed: 135 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -131,22 +131,150 @@ python3 read_smaps_batch.py <SMAPS_FILES...> [OPTIONS]
131131
After processing a batch of smaps files, you can use this script to analyze the entire run. It reads a directory of processed smaps files, tracks memory usage over time, and generates a summary report.
132132

133133
The report includes:
134-
* The top 5 largest memory consumers by PSS and RSS at the end of the run.
135-
* The top 5 memory regions that have grown the most in PSS and RSS over the duration of the run.
134+
* The top 10 largest memory consumers by PSS and RSS at the end of the run.
135+
* The top 10 memory regions that have grown the most in PSS and RSS over the duration of the run.
136136
* The overall change in total PSS and RSS.
137137

138+
The script can also output a structured JSON file containing the time-series data for further analysis or visualization.
139+
138140
#### Usage
139141

140142
```bash
141-
python3 analyze_smaps_logs.py <PROCESSED_LOG_DIR>
143+
python3 analyze_smaps_logs.py <PROCESSED_LOG_DIR> [OPTIONS]
142144
```
143145

146+
#### Command-line Arguments
147+
148+
* `<PROCESSED_LOG_DIR>` (required positional argument)
149+
The path to the directory containing the processed smaps log files.
150+
* `--json_output` (type: `str`)
151+
Optional: The path to a file where the JSON analysis output will be saved.
152+
153+
#### Examples
154+
155+
1. **Print a text-based analysis to the console:**
156+
```bash
157+
python3 analyze_smaps_logs.py processed_logs
158+
```
159+
160+
2. **Generate a JSON output file for visualization:**
161+
```bash
162+
python3 analyze_smaps_logs.py processed_logs --json_output analysis_output.json
163+
```
164+
165+
### `visualize_smaps_analysis.py`
166+
167+
This script takes the JSON output from `analyze_smaps_logs.py` and generates a dashboard-style PNG image with three plots:
168+
1. Total PSS and RSS memory usage over time.
169+
2. A stacked area chart of the top 10 memory consumers.
170+
3. A line chart of the top 10 memory growers.
171+
172+
This provides a quick and intuitive way to visualize memory behavior and identify potential leaks.
173+
174+
#### Prerequisites
175+
176+
This script requires the `pandas` and `matplotlib` libraries. You can install them using pip:
177+
```bash
178+
pip install pandas matplotlib
179+
```
180+
181+
#### Usage
182+
183+
```bash
184+
python3 visualize_smaps_analysis.py <JSON_FILE> [OPTIONS]
185+
```
186+
187+
#### Command-line Arguments
188+
189+
* `<JSON_FILE>` (required positional argument)
190+
The path to the input JSON file generated by `analyze_smaps_logs.py`.
191+
* `--output_image` (type: `str`, default: `smaps_analysis.png`)
192+
The path where the output PNG image will be saved.
193+
144194
#### Example
145195

146196
```bash
147-
python3 analyze_smaps_logs.py processed_logs
197+
python3 visualize_smaps_analysis.py analysis_output.json --output_image my_analysis.png
148198
```
149199

200+
## Extending the Toolchain with New Fields
201+
202+
The toolchain is designed to be extensible, allowing you to add new fields from the raw smaps files to the analysis and visualization. The key is to follow the data pipeline from the processor (`read_smaps.py`) to the analyzer (`analyze_smaps_logs.py`) and visualizer (`visualize_smaps_analysis.py`).
203+
204+
Here is a step-by-step guide using the `Locked` field as an example.
205+
206+
### Step 1: Update the Processor (`read_smaps.py`)
207+
208+
This is the most critical step to get the new data into the processed files.
209+
210+
1. **Add the new field to the `fields` tuple:**
211+
In `cobalt/tools/performance/smaps/read_smaps.py`, add your new field (in lowercase) to the `fields` string.
212+
213+
```python
214+
# --- BEFORE ---
215+
# fields = ('size rss pss ... swap swap_pss').split()
216+
217+
# --- AFTER ---
218+
fields = ('size rss pss shr_clean shr_dirty priv_clean priv_dirty '
219+
'referenced anonymous anonhuge swap swap_pss locked').split()
220+
MemDetail = namedtuple('name', fields)
221+
```
222+
223+
2. **Update the `MemDetail` creation:**
224+
The `parse_smaps_entry` function automatically parses all fields into a dictionary. You just need to use the new field when creating the `MemDetail` object.
225+
226+
```python
227+
# --- BEFORE ---
228+
# d = MemDetail(..., data['swap'], data['swappss'])
229+
230+
# --- AFTER ---
231+
d = MemDetail(
232+
data['size'], data['rss'], data['pss'], data['sharedclean'],
233+
data['shareddirty'], data['privateclean'], data['privatedirty'],
234+
data['referenced'], data['anonymous'], data['anonhugepages'],
235+
data['swap'], data['swappss'], data['locked'])
236+
```
237+
238+
After these two changes, re-running `read_smaps_batch.py` will produce processed files that include a `locked` column.
239+
240+
### Step 2: (Optional) Use the Field in the Analyzer (`analyze_smaps_logs.py`)
241+
242+
The analyzer will now have access to the `locked` data. To display it, you can add it to the text report. For example, to show the total change in locked memory:
243+
244+
```python
245+
# In analyze_logs in analyze_smaps_logs.py
246+
247+
# Add this block to the "Overall Total Memory Change" section
248+
if 'locked' in total_history and len(total_history['locked']) > 1:
249+
total_locked_change = total_history['locked'][-1] - total_history['locked'][0]
250+
print(f' Total Locked Change: {total_locked_change} kB')
251+
```
252+
253+
### Step 3: (Optional) Use the Field in the Visualizer (`visualize_smaps_analysis.py`)
254+
255+
To add the new field to the graph, you need to:
256+
257+
1. **Add the field to the JSON output in `analyze_smaps_logs.py`:**
258+
```python
259+
# In the 'regions' list comprehension inside analyze_logs:
260+
'regions': [{
261+
'name': name,
262+
'pss': data.get('pss', 0),
263+
'rss': data.get('rss', 0),
264+
'locked': data.get('locked', 0) # Add this line
265+
} for name, data in memory_data.items()]
266+
```
267+
268+
2. **Add a new plot in `visualize_smaps_analysis.py`:**
269+
For example, to add a line for "Total Locked" memory to the first chart:
270+
```python
271+
# In create_visualization in visualize_smaps_analysis.py:
272+
total_locked = [d['total_memory'].get('locked', 0) for d in data]
273+
ax1.plot(timestamps, total_locked, label='Total Locked', color='green')
274+
```
275+
276+
By following this pattern, you can incorporate any field from the raw smaps files into the entire toolchain.
277+
150278
## Testing
151279
152280
Unit tests are provided to ensure the functionality of the scripts. To run the tests, navigate to the project root directory and execute the following commands. Note that `__init__.py` handles Python path setup, so tests should always be run from the project root.
@@ -160,4 +288,7 @@ python3 -m unittest cobalt/tools/performance/smaps/read_smaps_test.py
160288
161289
# For analyze_smaps_logs.py
162290
python3 -m unittest cobalt/tools/performance/smaps/analyze_smaps_logs_test.py
291+
292+
# For visualize_smaps_analysis.py
293+
python3 -m unittest cobalt/tools/performance/smaps/visualize_smaps_analysis_test.py
163294
```

cobalt/tools/performance/smaps/analyze_smaps_logs.py

Lines changed: 69 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@
1717

1818
import argparse
1919
from collections import defaultdict, OrderedDict
20+
import json
2021
import os
2122
import re
2223
import sys
@@ -77,7 +78,20 @@ def parse_smaps_file(filepath):
7778
# This will skip non-integer lines, like the repeated header
7879
continue
7980

80-
return memory_data, total_data
81+
# Second pass for aggregation
82+
aggregated_data = OrderedDict()
83+
shared_mem_total = defaultdict(int)
84+
for name, data in memory_data.items():
85+
if name.startswith('mem/shared_memory'):
86+
for field, value in data.items():
87+
shared_mem_total[field] += value
88+
else:
89+
aggregated_data[name] = data
90+
91+
if shared_mem_total:
92+
aggregated_data['[mem/shared_memory]'] = dict(shared_mem_total)
93+
94+
return aggregated_data, total_data
8195

8296

8397
def extract_timestamp(filename):
@@ -104,7 +118,7 @@ def get_top_consumers(memory_data, metric='pss', top_n=5):
104118
return sorted_consumers[:top_n]
105119

106120

107-
def analyze_logs(log_dir):
121+
def analyze_logs(log_dir, json_output_filepath=None):
108122
"""Analyzes a directory of processed smaps logs."""
109123
all_files = [
110124
os.path.join(log_dir, f)
@@ -119,13 +133,16 @@ def analyze_logs(log_dir):
119133

120134
print(f'Analyzing {len(all_files)} processed smaps files...')
121135

136+
# List to store structured data for JSON output
137+
analysis_data = []
138+
122139
# Store data over time for each memory region
123-
region_history = defaultdict(lambda: defaultdict(list))
124140
total_history = defaultdict(list)
125141

126142
first_timestamp = None
127143
last_timestamp = None
128144
last_memory_data = None
145+
first_memory_data = None
129146

130147
for filepath in all_files:
131148
filename = os.path.basename(filepath)
@@ -140,11 +157,22 @@ def analyze_logs(log_dir):
140157
print(f'Warning: {e}')
141158
continue
142159

160+
if first_memory_data is None:
161+
first_memory_data = memory_data
143162
last_memory_data = memory_data # Keep track of the last data
144163

145-
for region_name, data in memory_data.items():
146-
for metric, value in data.items():
147-
region_history[region_name][metric].append(value)
164+
current_snapshot = {
165+
'timestamp':
166+
timestamp,
167+
'total_memory':
168+
total_data,
169+
'regions': [{
170+
'name': name,
171+
'pss': data.get('pss', 0),
172+
'rss': data.get('rss', 0)
173+
} for name, data in memory_data.items()]
174+
}
175+
analysis_data.append(current_snapshot)
148176

149177
for metric, value in total_data.items():
150178
total_history[metric].append(value)
@@ -153,53 +181,57 @@ def analyze_logs(log_dir):
153181
print(f'Analysis from {first_timestamp} to {last_timestamp}')
154182
print('=' * 50)
155183

184+
# Output JSON data if requested
185+
if json_output_filepath:
186+
with open(json_output_filepath, 'w', encoding='utf-8') as f:
187+
json.dump(analysis_data, f, indent=2)
188+
print(f'JSON analysis saved to {json_output_filepath}')
189+
156190
# 1. Largest Consumers by the end log
157-
print('\nTop 5 Largest Consumers by the End Log (PSS):')
158-
top_pss = get_top_consumers(last_memory_data, metric='pss', top_n=5)
191+
print('\nOverall Total Memory Change:')
192+
print('\nTop 10 Largest Consumers by the End Log (PSS):')
193+
top_pss = get_top_consumers(last_memory_data, metric='pss', top_n=10)
159194
for name, data in top_pss:
160195
print(f" - {name}: {data.get('pss', 0)} kB PSS, "
161196
f"{data.get('rss', 0)} kB RSS")
162197

163-
print('\nTop 5 Largest Consumers by the End Log (RSS):')
164-
top_rss = get_top_consumers(last_memory_data, metric='rss', top_n=5)
198+
print('\nTop 10 Largest Consumers by the End Log (RSS):')
199+
top_rss = get_top_consumers(last_memory_data, metric='rss', top_n=10)
165200
for name, data in top_rss:
166201
print(f" - {name}: {data.get('rss', 0)} kB RSS, "
167202
f"{data.get('pss', 0)} kB PSS")
168203

169-
# 2. Top 5 Increases in Memory Over Time
170-
print('\nTop 5 Memory Increases Over Time (PSS):')
204+
# 2. Top 10 Increases in Memory Over Time
205+
print('\nTop 10 Memory Increases Over Time (PSS):')
171206
pss_growth = []
172-
for region_name, history in region_history.items():
173-
if 'pss' in history and len(history['pss']) > 1:
174-
growth = history['pss'][-1] - history['pss'][0]
207+
if last_memory_data and first_memory_data:
208+
all_keys = set(first_memory_data.keys()) | set(last_memory_data.keys())
209+
for r_name in all_keys:
210+
initial_pss = first_memory_data.get(r_name, {}).get('pss', 0)
211+
final_pss = last_memory_data.get(r_name, {}).get('pss', 0)
212+
growth = final_pss - initial_pss
175213
if growth > 0:
176-
pss_growth.append((region_name, growth))
214+
pss_growth.append((r_name, growth))
177215

178216
pss_growth.sort(key=lambda item: item[1], reverse=True)
179-
for name, growth in pss_growth[:5]:
217+
for name, growth in pss_growth[:10]:
180218
print(f' - {name}: +{growth} kB PSS')
181219

182-
print('\nTop 5 Memory Increases Over Time (RSS):')
220+
print('\nTop 10 Memory Increases Over Time (RSS):')
183221
rss_growth = []
184-
for region_name, history in region_history.items():
185-
if 'rss' in history and len(history['rss']) > 1:
186-
growth = history['rss'][-1] - history['rss'][0]
222+
if last_memory_data and first_memory_data:
223+
all_keys = set(first_memory_data.keys()) | set(last_memory_data.keys())
224+
for r_name in all_keys:
225+
initial_rss = first_memory_data.get(r_name, {}).get('rss', 0)
226+
final_rss = last_memory_data.get(r_name, {}).get('rss', 0)
227+
growth = final_rss - initial_rss
187228
if growth > 0:
188-
rss_growth.append((region_name, growth))
229+
rss_growth.append((r_name, growth))
189230

190231
rss_growth.sort(key=lambda item: item[1], reverse=True)
191-
for name, growth in rss_growth[:5]:
232+
for name, growth in rss_growth[:10]:
192233
print(f' - {name}: +{growth} kB RSS')
193234

194-
# Overall Total Memory Change
195-
print('\nOverall Total Memory Change:')
196-
if 'pss' in total_history and len(total_history['pss']) > 1:
197-
total_pss_change = total_history['pss'][-1] - total_history['pss'][0]
198-
print(f' Total PSS Change: {total_pss_change} kB')
199-
if 'rss' in total_history and len(total_history['rss']) > 1:
200-
total_rss_change = total_history['rss'][-1] - total_history['rss'][0]
201-
print(f' Total RSS Change: {total_rss_change} kB')
202-
203235

204236
def run_smaps_analysis_tool(argv=None):
205237
"""Parses arguments and runs the smaps log analysis."""
@@ -211,8 +243,12 @@ def run_smaps_analysis_tool(argv=None):
211243
type=str,
212244
help='Path to the directory containing processed smaps log files.')
213245

246+
parser.add_argument(
247+
'--json_output',
248+
type=str,
249+
help='Optional: Path to a file where JSON output will be saved.')
214250
args = parser.parse_args(argv)
215-
analyze_logs(args.log_dir)
251+
analyze_logs(args.log_dir, args.json_output)
216252

217253

218254
def main():

cobalt/tools/performance/smaps/analyze_smaps_logs_test.py

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -101,24 +101,21 @@ def test_parse_invalid_file(self):
101101
@patch('sys.stdout', new_callable=StringIO)
102102
def test_analyze_logs_output(self, mock_stdout):
103103
"""Tests the main analysis function and captures its output."""
104-
test_argv = [self.test_dir]
105-
analyze_smaps_logs.run_smaps_analysis_tool(test_argv)
104+
analyze_smaps_logs.analyze_logs(self.test_dir)
106105
output = mock_stdout.getvalue()
107106

108107
# Check for top consumers in the end log
109-
self.assertIn('Top 5 Largest Consumers by the End Log (PSS):', output)
108+
self.assertIn('Top 10 Largest Consumers by the End Log (PSS):', output)
110109
self.assertIn('- <lib_B>: 1500 kB PSS', output)
111110
self.assertIn('- <lib_A>: 1200 kB PSS', output)
112111

113112
# Check for memory growth
114-
self.assertIn('Top 5 Memory Increases Over Time (PSS):', output)
113+
self.assertIn('Top 10 Memory Increases Over Time (PSS):', output)
115114
self.assertIn('- <lib_B>: +1000 kB PSS', output)
116115
self.assertIn('- <lib_A>: +200 kB PSS', output)
117116

118117
# Check for overall change
119118
self.assertIn('Overall Total Memory Change:', output)
120-
self.assertIn('Total PSS Change: 1200 kB', output)
121-
self.assertIn('Total RSS Change: 1300 kB', output)
122119

123120
@patch('sys.stderr', new_callable=StringIO)
124121
def test_extract_timestamp_warning(self, mock_stderr):

0 commit comments

Comments
 (0)