I'm starting to play with {fmt} and wrote a little program to see how it processes large containers. It appears that fmt::print() (which ultimately sends output to stdout) first composes the entire result as a string in memory. In the test program below I format a vector<char> of 10,000,000 elements with a format string that consumes 100 bytes per entry, and the process amasses the full 100 * 10,000,000 = 1 GB of RAM before it starts dumping the result to stdout. Although you can't tell from my test program's output, almost all of the 1.7 seconds spent formatting and outputting the result goes to the formatting, not the outputting. (If you don't redirect to /dev/null, there is a long pause before anything starts printing to stdout.) This is not good behavior if you're trying to build pipelining tools.
Q1. I see references in the docs to fmt::format_to(). Can it somehow be used to stream out (and discard) the result before the formatting is complete, and thereby avoid composing the full result in memory?
Q2. Continuing along this line of exploration: instead of passing a container, is there a way to pass, say, two iterators (perhaps pointing at the beginning and end of a very large file) and pump that data through {fmt} for processing, thereby avoiding having to read the entire file into memory first? Something like the sketch below.
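To make Q2 concrete, here is roughly the call I have in mind. This is only a sketch: I am assuming the iterator-pair overload of fmt::join() in fmt/ranges.h accepts input iterators such as std::istreambuf_iterator, and of course it would have to be combined with whatever answers Q1, since fmt::print() as written would still buffer the entire result before writing it.

#include <fstream>
#include <iterator>
#include "fmt/ranges.h"

// Sketch: pump the bytes of a large file through {fmt} one at a time,
// without first reading the whole file into memory.
void dump_file(const char* path) {
    std::ifstream in(path, std::ios::binary);
    fmt::print("{}\n",
               fmt::join(std::istreambuf_iterator<char>(in),
                         std::istreambuf_iterator<char>(),
                         "\n"));
}

My actual test program follows.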
#include <cstdio>        // fprintf
#include <string>
#include <vector>
#include <time.h>        // clock_gettime (POSIX)
#include "fmt/format.h"
#include "fmt/ranges.h"  // fmt::join

using namespace std;

// Raw monotonic clock reading, in nanoseconds.
inline long long
clock_monotonic_raw() {
    struct timespec ct;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ct);
    return ct.tv_sec * 1000000000LL + ct.tv_nsec;
}

// Seconds elapsed since the first call.
inline double
dt() {
    static long long t0 = 0;
    if (t0 == 0) {
        t0 = clock_monotonic_raw();
        return 0.0;
    }
    long long t1 = clock_monotonic_raw();
    return (t1 - t0) / 1.0e9;
}

int main(int argc, char** argv) {
    fprintf(stderr, "%10.6f: ENTRY\n", dt());

    // 10,000,000 single-character entries...
    vector<char> v;
    for (int i = 0; i < 10'000'000; ++i)
        v.push_back('A' + i % 26);
    // ...each padded so it consumes 100 output bytes (98 spaces + char + '\n').
    string pad(98, ' ');
    fprintf(stderr, "%10.6f: INIT\n", dt());

    fmt::print(pad + "{}\n", fmt::join(v, "\n" + pad));
    fprintf(stderr, "%10.6f: DONE\n", dt());
    return 0;
}
matt@dworkin:fmt_test$ g++ -o mem_fmt -O3 -I ../fmt/include/ mem_fmt.cpp ../fmt/libfmt.a
matt@dworkin:fmt_test$ ./mem_fmt > /dev/null
0.000000: ENTRY
0.034582: INIT
1.769687: DONE
[from another window whilst it's running]
matt@dworkin:fmt_test$ ps -aux | egrep 'COMMAND|mem_fmt' | grep -v grep
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
matt 30292 2.8 6.2 1097864 999208 pts/0 S+ 17:40 0:01 ./mem_fmt
Note the VSZ of 1,097,864 KiB, i.e. roughly 1.1 GB.
Replacing the fmt::print() call with

fmt::format_to(ostream_iterator<char>(std::cout), pad + "{}\n", fmt::join(v, "\n" + pad));

solves the memory problem, but at the cost of a 20x performance hit. Perhaps a simpler iterator could be devised, one that either writes directly to stdout or accumulates into a ~4K buffer that it dumps to stdout in chunks. I would still like to know whether someone can find a better solution to my first question, and any solution at all to my second question.
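For what it's worth, here is the kind of chunking iterator I have in mind. This is only a sketch: the types stdout_chunk_buffer and stdout_chunk_iterator are my own, not part of {fmt}, and I am assuming fmt::format_to() accepts any type that models an output iterator.

#include <cstdio>
#include <cstddef>
#include <iterator>
#include "fmt/format.h"

// Fixed 4 KiB buffer that flushes to stdout whenever it fills.
struct stdout_chunk_buffer {
    static constexpr std::size_t capacity = 4096;
    char data[capacity];
    std::size_t size = 0;

    void push(char c) {
        data[size++] = c;
        if (size == capacity) flush();
    }
    void flush() {  // must also be called once after formatting completes
        std::fwrite(data, 1, size, stdout);
        size = 0;
    }
};

// Output iterator that appends each char to a shared stdout_chunk_buffer;
// copies of the iterator all point at the same buffer.
struct stdout_chunk_iterator {
    using iterator_category = std::output_iterator_tag;
    using value_type        = void;
    using difference_type   = std::ptrdiff_t;
    using pointer           = void;
    using reference         = void;

    stdout_chunk_buffer* buf;

    stdout_chunk_iterator& operator=(char c) { buf->push(c); return *this; }
    stdout_chunk_iterator& operator*()       { return *this; }
    stdout_chunk_iterator& operator++()      { return *this; }
    stdout_chunk_iterator  operator++(int)   { return *this; }
};

The fmt::print() call in the test program would then become:

    stdout_chunk_buffer buf;
    fmt::format_to(stdout_chunk_iterator{&buf}, pad + "{}\n", fmt::join(v, "\n" + pad));
    buf.flush();

A cheaper off-the-shelf alternative might be std::ostreambuf_iterator<char>(std::cout), which writes through the stream's own buffer and skips the per-character operator<< dispatch that ostream_iterator incurs, though I haven't measured it.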