-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Use embedded files also in installed builds #12000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
krobelus
wants to merge
62
commits into
fish-shell:master
Choose a base branch
from
krobelus:embed-always
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
… are in sync Commit 1fe6b28 (rustfmt.toml: specify edition to allow 2024 syntax, 2025-10-19) mentions that "cargo fmt" has different behavior than "rustfmt" before that commit. Probably because when .rustfmt.toml exists, rustfmt implicitly uses a different edition (2018?) that doesn't support c"" yet. That commit bumped the edition to 2024, which caused yet another deviation from "cargo fmt": Error writing files: failed to resolve mod `tests`: cannot parse /home/johannes/git/fish-shell/src/wutil/tests.rs error: expected identifier, found reserved keyword `gen` --> /home/johannes/git/fish-shell/src/tests/topic_monitor.rs:48:9 | 48 | for gen in &mut gens_list { | ^^^ expected identifier, found reserved keyword This has since been fixed by 0078424 (Update to rust 2024 edition, 2025-10-22). Let's add a test so that such changes won't randomly break "rustfmt" again. Fix that by using 2021 edition, like we do in Cargo.toml. In future, rustfmt should probably default to a current edition (or maybe read the edition from Cargo.toml?) Not yet sure which one is the upstream issue, maybe rust-lang/rustfmt#5650
For historical reasons[^1], most of our Rust tests are in src/tests,
which
1. is unconventional (Rust unit tests are supposed to be either in the
same module as the implementation, or in a child module).
This makes them slightly harder to discover, navigate etc.
2. can't test private APIs (motivating some of the "exposed for
testing" comments).
Fix this by moving tests to the corresponding implementation file.
Reviewed with
git show $commit \
--color-moved=dimmed-zebra \
--color-moved-ws=allow-indentation-change
- Shared test-only code lives in
src/tests/prelude.rs,
src/builtins/string/test_helpers.rs
src/universal_notifier/test_helpers.rs
We might want to slim down the prelude in future.
- I put our two benchmarks below tests ("mod tests" followed by "mod bench").
Verified that "cargo +nightly bench --features=benchmark" still
compiles and runs. For some weird reason, the moved benchmarks throw
an unused-import warning though.
[^1]: Separate files are idiomatic in most other languages; also
separate files makes it easy to ignore when navigating the call graph.
("rg --vimgrep | rg -v tests/"). Fortunately, rust-analyzer provides
a setting called references.excludeTests for textDocument/references,
the textDocument/prepareCallHierarchy family, and potentially
textDocument/documentHighlight (which can be used to find all
references in the current file).
Closes fish-shell#11992
It failed for me on Linux. Same as abf3f50 (Relax test `test_dup2_during_select_ebadf` (fish-shell#11976), 2025-10-19).
Their msgfmt doesn't support --output-file=- yet, so use a temporary file if "msgfmt" doesn't support "--check-format". Else we should have GNU gettext which supports both. See fish-shell#11982
Try to match https://github.com/apple-oss-distributions/shell_cmds/blob/b33f386ac431f17d4799b5662f20ff2ddf0773b9/path_helper/path_helper.c 1. don't add empty segments 2. but always add an empty segment at the end See fish-shell#10684
Functions are always qualified with "libc::".
These are obsolete as of c8001b5 (encoding: use UTF-8 everywhere, 2025-10-18). The only place where we still read LC_CTYPE is wcwidth() (parent commit).
With this patch, our test driver unsets all LC_* variables and LANGUAGE, and sets LANG=C (because LANG=en would change quotes in the output). The only place where explicit Unicode support seems to be necessary is GNU sed called in the tests/checks/sphinx-markdown-changelog.fish test. In future we might want to avoid these issues by setting LANG=C.UTF-8 (or even LANG=en_US.UTF-8) instead of LANG=C, but for now, this shows that our fish/python scripts use UTF-8 regardless of environment.
Link to history, printf and "builtin _" which are the only(?) users of LC_TIME, LC_NUMERIC and LC_MESSAGES respectively (besides the core equivalent of "builtin _").
These include variables that are ignored by fish, which seems questionable for __fish_set_locale? The the effect seems benign because __fish_set_locale will do nothing if any of those variables is defined. Maybe we can get rid of this function eventually.
We may have used LC_COLLATE in the past via libc functions but I don't think we do today.
* since c8001b5 (encoding: use UTF-8 everywhere, 2025-10-18) we always use UTF-8, which simplifies docs. * emphasize that we (as of an earlier commit) document only the locale variables actually used by fish. We could change this in future, as long as the docs make it obvious whether it's about fish or external programs. * make things a bit more concise * delete a stale comment - missing encoding support is no longer a problem
Commit 8dc3982 (Always use LC_NUMERIC=C internally (fish-shell#8204), 2021-10-13) made us use LC_NUMERIC=C internally, to make C library functions behave in a predictable way. We no longer use library functions affected by LC_NUMERIC. (Except for snprintf in tests). Since the effective value of LC_NUMERIC no longer matters, let's simplify things by using the user locale again, like we do for the other locale variables.
Commit 046db09 (Try to set LC_CTYPE to something UTF-8 capable (fish-shell#8031), 2021-06-06) forced UTF-8 encoding if available. Since c8001b5 (encoding: use UTF-8 everywhere, 2025-10-18) we no longer override LC_CTYPE, since we no longer use it for anything like mbrtowc or iswalpha. However there is one remaining use: our fallbacks to system wcwidth(). If we are started as LC_ALL=C fish then wcwidth('😃') will fail to return the correct value, even if the UTF-8 locale would have been available. Restore the previous behavior, so locale variables don't affect emoji width. This is consistent with the direction in c8001b5 which stops us from falling back to ASCII characters if our desired multibyte locale is missing (that was not the ideal criteria). In future we should maybe stop using wcwidth().
We use the following locale-variables via locale-aware C functions (you can look them up by grepping for "libc::$function_name"): - LC_CTYPE: wcwidth - LC_MESSAGES: strerror, strerror - LC_NUMERIC: localeconv/localeconv_l, snprintf (only in tests) - LC_TIME: strftime Additionally, we interpret LC_MESSAGES in our own code. As of today, the PCRE2 library does not seem to use LC_MESSAGES (error messages are English-only); and I don't think it uses LC_CTYPE unless we use "pcre2_maketables()". Let's make it more obvious which locale categories are actually used by setting those instead of LC_ALL. This means that instead of logging the return value of « setlocale(LC_ALL, "") » (which is an opaque binary string on Linux), we can log the actual per-category locales that are in effect. This means that there's no longer really a reason to hardcode a log for the LC_MESSAGES locale. We're not logging at early startup but we will log as soon as any locale var is detected as set.
Since c8001b5 (encoding: use UTF-8 everywhere, 2025-10-18), a change in locale variables can no longer cause us to toggle between "…" or "...". Have the control flow reflect this.
When "status fish-path" fails, we fall back to argv[0], hoping that it contains an absolute path to fish, or at least is equal to "fish" (which can happen on OpenBSD, see rust-lang/rust#60560 (comment)). I don't think it makes sense to duplicate logic that's probably already in std::env::current_exe().
Commit 7b59ae0 (Unbreak hack to strip " (deleted)" suffix from executable path, 2025-10-02) accidentally droped the "!".
This value is either "fish" or an absolute path, it can't be empty.
This was used in fish_tests.cpp 754b78a (Suppress certain stderr-printing during tests, 2016-12-03) which is no more. I haven't noticed this specific outptu in "cargo test" output.
As mentioned earlier commits, "status fish-path" is either an absolute path or "fish". At startup, we try to canonicalize this path. This is wrong if "status fish-path" prints "fish" -- we'll do subtly wrong things if that file happens to exist in the current working directory.
If fish is invoked as
execve("/bin/fish", "fish")
on OpenBSD, "status fish-path" will output "fish". As a result,
our wrapper for fish_indent tries to execute "./fish" if it exists.
We actually no longer need to call any executable, since "fish_indent"
is available as builtin now.
Stop using the wrapper, fixing the OpenBSD issue.
awk is mawk on Ubuntu.
In future, we should ask "renovatebot" to update these version. I don't have an opinion on whether to use "uv" or something else, but I think we do want lockfiles. No particular reason for this Python version. Somehow tests/checks/sphinx-markdown-changelog.fish doesn't seem to have run in CI before this (or possibly another change). Anway, it's failing on GitHub Actions because no tags are available (shallow clone). Disable that part of the this test on CI for now; the most important part is the RST->Markdown conversion and syntax check. Part of fish-shell#11996
The new automation workflow doesn't sign the Git tag or the tarball as we used to. Since the tag is still created locally, sign it directly. For signing the tarball; build it locally and compare the extracted tarball with the one produced by GitHub. Closes fish-shell#11996
Google cloud CLI ships >10k man pages for subcommands, but the completions are not useful because they don't know to replace underscores by spaces, e.g. in: complete -c gcloud_access-approval_requests_approve
Enables the next commit.
This helps the next commit.
On first startup, we run this script, but not for standalone (embed-data) builds. Rectify that.
Seems to save only 1KiB in binary size (might be noise).
It's always the CMake output directory, so call it FISH_CMAKE_BINARY_DIR. It's possible to set it via some other build system but if such builds exist, they are likely subtly broken, or at least with the following commit which adds the assumption that "share/__fish_build_paths.fish.in" exists in this directory. We could even call it CMAKE_BINARY_DIR but let's namespace it to make our use more obvious. Also, stop using the $CMAKE environment variable, it's not in our namespace.
Today, this file is only supported in CMake builds. This is the last place where CMake builds use $__fish_data_dir. Let's embed it so we can treat it like other assets. This enables users to do "rm -rf $__fish_data_dir" (though that might break plugins).
According to https://cmake.org/cmake/help/latest/command/install.html default paths look like Type variable default INCLUDE ${CMAKE_INSTALL_INCLUDEDIR} include SYSCONF ${CMAKE_INSTALL_SYSCONFDIR} etc DATA ${CMAKE_INSTALL_DATADIR} <DATAROOT dir> MAN ${CMAKE_INSTALL_MANDIR} <DATAROOT dir>/man so "etc" is supposed to be a sibling of "include", not a child of DATA (aka "share/"). This allows us to remove a hack where we needed to know the data dir in non-standalone builds.
This matches the convention used by Make/CMake and others.
Some packagers such as Homebrew use the "relocatable-tree" logic,
i.e. "fish -d config" shows paths like:
/usr/local/etc/fish
/usr/local/share/fish
Other packagers use
/etc/fish
/usr/share/fish
which we don't treat as relocatable tree,
because "/usr/etc/fish" does **not** exists.
To get embed-data in shape for being used as a default for installed
builds, we want to support both cases.
Today, embed-data builds only handle the "in-build-dir" logic.
Teach them also about the relocatable-tree logic. This will make
embed-data builds installed via Homebrew (i.e. CMake) use the correct
sysconfdir ("/usr/local/etc").
We use absence of "$__fish_data_dir[1]" as criteria to use the "standalone" code paths that use "status list-files/get-file" instead of $__fish_data_dir. However, third party software seems slow to react to such breaking changes, see ajeetdsouza/zoxide#1045 So keep $__fish_data_dir for now to give plugins more time. This commit makes us ignore $__fish_data_dir on standalone builds even if defined; a following commit will actually define it again. I guess the approach in b815fff (Set $__fish_data_dir to empty for embed-data builds, 2025-04-01) made sense back when we didn't anticipate switching to standalone builds by default yet.
To prepare for defining __fish_data_dir/__fish_help_dir again on embed-data builds, reverts commit 04a2398. No functional change.
Implied by bee1e12 (Remove unused locale path code, 2025-09-16).
While at it, reorder the assignments to match the order in ConfigPaths.
As mentioned in a previous commit ("Don't special-case __fish_data_dir
in standalone builds"), we want to enable embed-data by default.
To reduce the amount of breakage, we'll keep installing files,
and keep defining those variables at least for some time.
For example, this will allow installed "standalone" builds to use
local docs via our "help" function.
When users update fish by replacing files, existing shells might throw weird errors because internal functions in share/functions/ might have changed. This is easy to remedy by restarting the shell, but I guess it would be nice if old shells kept using old data. This is somewhat at odds with lazy-loading. But we can use the standalone mode. Turn that on by default. Additionally, this could simplify packaging. Some packages incorrectly third party files into /usr/share/fish/completion/ when they ought to use /usr/share/fish/vendor_completions.d/ for that. That packaging mistake can make file conflicts randomly appearing when either fish or foo ships a completion for foo. With this patch, it no longer causes conflicts but will silently not work, for better or worse. The only advantage of having /usr/share/fish/ on the file system is discoverability. But the full source code is better. Note that we still install (redundant) $__fish_data_dir when using CMake. This is to not unnecessarily break both 1. already running (old) shells as users upgrade to 4.2 2. plugins (see parent commit), We can stop installing $__fish_data_dir once we expect 99% of users to upgrade from at least 4.2. Closes fish-shell#11921 To-do: - maybe make it a feature flag so users can turn it off without recompiling with "set -Ua no-embed-data".
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Turn on the "embed-data" feature by default, to load our functions/completions directly from the fish binary.
Mainly to remove weird behavior in running shells when upgrading (#11921),
but also as a part of the effort to reduce differences between standalone and installed mode.
I wonder if we should already remove the ability to turn off
embed-data.(Draft state; this PR still includes some unrelated changes;
happy to accomodate review better but I still need to do self-review)