Skip to content

Conversation

@kv2019i
Copy link
Collaborator

@kv2019i kv2019i commented Dec 19, 2022

Due to allocation failures, or invalid content in dictionary,
"params" entries in "struct proc_ldc_entry" may be NULL.

In print_entry_params(), the NULL entries may be passed
as arguments to fprintf(). While e.g. glibc handles these without
error, this is not guaranteed behaviour and may result in segfault
on some platforms.

Fix the issue by replacing any NULL entries in process_params()
and separately handling the NULL case in dump_ldc_info().

Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com

@kv2019i
Copy link
Collaborator Author

kv2019i commented Dec 19, 2022

V2:

  • oops, compiler did not like "stdderr", fixed. thanks @ranj063 for spotting

Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see format_uid_raw() being used in at least one other place.

Morever format_uid_raw() is not even calling malloc() directly, there is more indirection.

=> Please fix log_vasprintf() once for all instead; that is the function not checking the malloc() return value.

@marc-hb
Copy link
Collaborator

marc-hb commented Dec 19, 2022

Also, please use abort() instead of exit() in cases like this because it creates a useful coredump. Demo:

--- a/tools/logger/logger.c
+++ b/tools/logger/logger.c
@@ -86,6 +86,8 @@ static void usage(void)
        fprintf(stdout, "%s:\t -F filter\t\tUpdate trace filter, format: "
                "<level>=<comp1>[, <comp2>]\n",
                APP_NAME);
+
+       abort();
        exit(0);
 }
./tools/build_tools/logger/sof-logger   


Aborted (core dumped)

coredumpctl list
TIME                           PID  UID  GID SIG     COREFILE EXE                                                             SIZE
Mon 2022-12-19 23:45:19 UTC 236154 1005 1005 SIGABRT present  /home/me/SOF/sof/tools/build_tools/logger/sof-logger 16.8K

coredumpctl gdb 236154

(gdb) backtrace
#4  0x0000563e885ef7cb in usage () at /home/me/SOF/sof/tools/logger/logger.c:90
#5  0x0000563e885f02b4 in main (argc=2, argv=0x7fff61f5dea8) at /home/me/SOF/sof/tools/logger/logger.c:392

@marc-hb
Copy link
Collaborator

marc-hb commented Dec 20, 2022

abort() is debug-friendly on Windows too:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/abort?view=msvc-170

When the program is compiled in debug mode, the message box displays options to Abort, Retry, or Ignore. If the user chooses Abort, the program terminates immediately and returns an exit code of 3. If the user chooses Retry, a debugger is invoked for just-in-time debugging, if available

@kv2019i
Copy link
Collaborator Author

kv2019i commented Dec 20, 2022

@marc-hb wrote:

@@ -86,6 +86,8 @@ static void usage(void)
fprintf(stdout, "%s:\t -F filter\t\tUpdate trace filter, format: "
"=[, ]\n",
APP_NAME);
+

  •   abort();
    

Coredumping when wrong option passed is not really a good option right (in usage()). UPDATE: I guess this was just for the demo, right.

I'm not a fan of abort() in application code, it really should be reserved to system library / runtime code in most cases.

Let me check if I can clean this up.

@kv2019i
Copy link
Collaborator Author

kv2019i commented Dec 20, 2022

@marc-hb wrote:

I see format_uid_raw() being used in at least one other place.

Morever format_uid_raw() is not even calling malloc() directly, there is more indirection.

=> Please fix log_vasprintf() once for all instead; that is the function not checking the malloc() return value.

I looked at this again and misc.c is clearly designed to handle out of memory cases, so I don't think it's appropriate to abort there.

In convert.c, "format_uid_raw() -> log_asprintf()", "asprintf_entry_text()" and "asprintf_uuid()" are all used to fill "pe->params[i]" which is later fed to printf().

On most systems this actually is safe as "(null)" is printed, but I guess this is not universally true and some ANSI C runtimes may segfault in this case. So I think the correct fix is to protect against printing a NULL. Let me modify the patch accordingly.

@kv2019i kv2019i force-pushed the 202212-soflogger-uidraw-fail branch from 6b34135 to 9f4805a Compare December 20, 2022 13:17
@kv2019i kv2019i changed the title sof-logger: exit with error if format_uid_raw() fails sof-logger: ensure NULL is not passed to printf/fprintf Dec 20, 2022
@kv2019i kv2019i requested a review from marc-hb December 20, 2022 13:18
@kv2019i kv2019i added the P1 Blocker bugs or important features label Dec 20, 2022
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could just use a single static const char missing[] = "<missing>"; string and not set subst_mask

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh Actually yes, that would be potentially cleaner. This still requires mucking with the subst_mask (which is a bit iffy as in theory subst_mask is not just to indicate how the object was allocated). Let me update the patch.

@kv2019i kv2019i force-pushed the 202212-soflogger-uidraw-fail branch from 9f4805a to cee70a7 Compare December 20, 2022 17:55
@kv2019i
Copy link
Collaborator Author

kv2019i commented Dec 20, 2022

V3 update:

@kv2019i kv2019i requested a review from lyakh December 20, 2022 18:09
@kv2019i kv2019i force-pushed the 202212-soflogger-uidraw-fail branch from cee70a7 to 2e7f445 Compare December 20, 2022 20:29
@kv2019i
Copy link
Collaborator Author

kv2019i commented Dec 20, 2022

V4:

  • dropped the no-longer-needed check in V3, no other changes

@marc-hb
Copy link
Collaborator

marc-hb commented Dec 20, 2022

https://sof-ci.01.org/sofpr/PR6848/build3091/devicetest/index.html has a worryingly large number of failures but they're all unrelated - sof-logger is not tested anymore on the main branch

@marc-hb
Copy link
Collaborator

marc-hb commented Dec 21, 2022

I looked at this again and misc.c is clearly designed to handle out of memory cases, so I don't think it's appropriate to abort there.

I think you're making this much more complicated than it needs to be.

So we have f1() and f2() calling log_vasprintf() which invokes malloc() and does something stupid when malloc returns NULL.

I think you're saying "f1() does not trust log_vasprintf(), so let's make sure f2() does not either".

If malloc() starts failing in this tool (in practice: never) then all bets are off and log_vasprintf() should abort() ASAP. One-line fix.

@kv2019i
Copy link
Collaborator Author

kv2019i commented Dec 21, 2022

@marc-hb wrote:

I think you're making this much more complicated than it needs to be.
[...]
If malloc() starts failing in this tool (in practice: never) then all bets are off and log_vasprintf() should abort() ASAP. One-line fix.

It's not just malloc fails, but asprintf_entry_text() (calling read_entry_from_ldc_file()) may return NULL for non-alloc failures as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels sketchy. Do we even support strings in the parameters in the first place? I thought only stuff like %d (%x, %o) and %p are supported in parameters. I feel like this is a correctness issue. But then you would also have to scan the format string to "ban" the %s specifier which is more complicated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strings not supported today and unlikely to be since this is all free with the Zephyr logger.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulstelian97 we do support strings. Here's example print for params when subst_mask is set:

params[0] 0x9E19E370 (subst_mas clear)
params[1] 'dma-trace-task <2b972272-c5b1-4b7e-926f-0fc5cb4c4690>' (subst_mask set)

And matching log output:

[         131.770828] (         131.770828) c0 ll-schedule        ./schedule/ll_schedule.c:513  INFO task add 0x9e19e370 dma-trace-task <2b972272-c5b1-4b7e-926f-0fc5cb4c4690>

It's still sketchy. Let me see if I can do a simpler change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused how strings can be supported because sof-logger uses dictionary logging and I'm not aware of any support for strings in there. Zephyr logging I think is not based on dictionary logging by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulstelian97 I think Zephyr now has the string dictionary support, initially all the strings were baked into the Zephyr image (sorry, I don't know the Zephyr internals here). xtos config does not support %s strings as we cant do stuff like __func__ in our trace today, only strings are in the trace body via the dictionary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I want to learn about the Zephyr logger and see if (after NXP migrates to fully Zephyr) we can use that. Right now we still have to use sof-logger due to still keeping the xtos builds (Zephyr isn't fully stable for us, there's a good few quirks left to fix)

Copy link
Collaborator Author

@kv2019i kv2019i Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulstelian97 @lgirdwood There's still code to handle "%s" but this is not used by any code in SOF FW and sof-logger emits a warning if one is seen. But we have two indirect usages of "%s", FW traces with "%pUx" and "%pQ" are converted to "%s" and the argument passed from FW (in raw log stream) is converted to a string that is passed as argument to (f)printf. And these steps can hit an alloc failure, or just missing conversion, leading to NULL being used.

But right, "plain %s" usage where you'd have %s used directly in FW code, is not there.

Copy link
Contributor

@jsarha jsarha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Ok. Then again being scared about a null parameter for format string and dealing with risking heap corruption is counter productive, as printf deals gracefully with NULL and even if it did not, finding the error would be easy. In case of heap corruption its not. But this is what it is these static analysis errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixing malloced pointers with static pointers is a bit scary, but I guess subst_mask is enough to make sure free is never called for "" .

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsarha Just a note: the code is already doing that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, looks good then.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets fix the potential NULL issue today, we can always come back and circle back on other updates if needed.

@kv2019i kv2019i changed the title sof-logger: ensure NULL is not passed to printf/fprintf [DNM] sof-logger: ensure NULL is not passed to printf/fprintf Dec 21, 2022
@kv2019i kv2019i force-pushed the 202212-soflogger-uidraw-fail branch from 2e7f445 to adb3466 Compare December 21, 2022 10:55
@kv2019i kv2019i changed the title [DNM] sof-logger: ensure NULL is not passed to printf/fprintf sof-logger: ensure NULL is not passed to printf/fprintf Dec 21, 2022
@kv2019i kv2019i force-pushed the 202212-soflogger-uidraw-fail branch from adb3466 to fa81f62 Compare December 21, 2022 10:58
@kv2019i
Copy link
Collaborator Author

kv2019i commented Dec 21, 2022

V5 update:

  • rebased and simplified implementation to abort on plain alloc fails and only handle the ldc entry lookup fail with explicit error handling

Due to allocation failures, or invalid content in dictionary,
"params" entries in "struct proc_ldc_entry" may be NULL.

In print_entry_params(), the NULL entries may be passed
as arguments to fprintf(). While e.g. glibc handles these without
error, this is not guaranteed behaviour and may result in segfault
on some platforms.

Fix the issue by aborting program if allocation fails and
explicitly handling the cases when asprintf_entry_text returns
NULL.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i kv2019i force-pushed the 202212-soflogger-uidraw-fail branch from fa81f62 to fce89c9 Compare December 21, 2022 11:00
pe->subst_mask |= 1 << i;
else
pe->params[i] = (uintptr_t)missing;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch is now only used for one trace in SOF:

./trace/trace.c:139:                 _TRACE_INV_ID, _TRACE_INV_ID, "Suppressed %u similar messages: %pQ",

But still, for correctness, the entry may not be found.

@lgirdwood lgirdwood merged commit 1a7a36a into thesofproject:main Dec 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P1 Blocker bugs or important features

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants