Commit 93b73b0

journal: by default do not decompress data objects larger than 64K
This introduces a new data threshold setting for sd_journal objects which controls the maximum size of objects to decompress. This relieves the library from having to decompress full data objects even if a client program is only interested in the initial part of them. This speeds up "systemd-coredumpctl" drastically when invoked without parameters.
1 parent f2cf040 commit 93b73b0

13 files changed, 135 additions and 25 deletions
man/sd_journal_get_data.xml

Lines changed: 54 additions & 4 deletions
@@ -47,6 +47,8 @@
         <refname>sd_journal_enumerate_data</refname>
         <refname>sd_journal_restart_data</refname>
         <refname>SD_JOURNAL_FOREACH_DATA</refname>
+        <refname>sd_journal_set_data_threshold</refname>
+        <refname>sd_journal_get_data_threshold</refname>
         <refpurpose>Read data fields from the current journal entry</refpurpose>
         </refnamediv>

@@ -81,6 +83,17 @@
         <paramdef>size_t <parameter>length</parameter></paramdef>
         </funcprototype>

+        <funcprototype>
+        <funcdef>int <function>sd_journal_set_data_threshold</function></funcdef>
+        <paramdef>sd_journal* <parameter>j</parameter></paramdef>
+        <paramdef>size_t <parameter>sz</parameter></paramdef>
+        </funcprototype>
+
+        <funcprototype>
+        <funcdef>int <function>sd_journal_get_data_threshold</function></funcdef>
+        <paramdef>sd_journal* <parameter>j</parameter></paramdef>
+        <paramdef>size_t* <parameter>sz</parameter></paramdef>
+        </funcprototype>
         </funcsynopsis>
         </refsynopsisdiv>

@@ -102,7 +115,11 @@
         <function>sd_journal_enumerate_data()</function>, or
         the read pointer is altered. Note that the data
         returned will be prefixed with the field name and
-        '='.</para>
+        '='. Also note that by default data fields larger than
+        64K might get truncated to 64K. This threshold may be
+        changed and turned off with
+        <function>sd_journal_set_data_threshold()</function> (see
+        below).</para>

         <para><function>sd_journal_enumerate_data()</function>
         may be used to iterate through all fields of the
@@ -128,6 +145,32 @@
         <citerefentry><refentrytitle>sd_journal_next</refentrytitle><manvolnum>3</manvolnum></citerefentry>
         (or related call) has been called at least
         once, in order to position the read pointer at a valid entry.</para>
+
+        <para><function>sd_journal_set_data_threshold()</function>
+        may be used to change the data field size threshold
+        for data returned by
+        <function>sd_journal_get_data()</function>,
+        <function>sd_journal_enumerate_data()</function> and
+        <function>sd_journal_enumerate_unique()</function>. This
+        threshold is a hint only: it indicates that the client
+        program is interested only in the initial parts of the
+        data fields, up to the threshold in size -- but the
+        library might still return larger data objects. That
+        means applications should not rely exclusively on this
+        setting to limit the size of the data fields returned,
+        but need to apply an explicit size limit on the
+        returned data as well. This threshold defaults to
+        64K. To retrieve the complete data fields this
+        threshold should be turned off by setting it to 0, so
+        that the library always returns the complete data
+        objects. It is recommended to set this threshold as
+        low as possible since this relieves the library from
+        having to decompress large compressed data objects in
+        full.</para>
+
+        <para><function>sd_journal_get_data_threshold()</function>
+        returns the currently configured data field size
+        threshold.</para>
         </refsect1>

         <refsect1>
@@ -144,15 +187,22 @@
         read, 0 when no more fields are known, or a negative
         errno-style error
         code. <function>sd_journal_restart_data()</function>
-        returns nothing.</para>
+        returns
+        nothing. <function>sd_journal_set_data_threshold()</function>
+        and <function>sd_journal_get_data_threshold()</function>
+        return 0 on success or a negative errno-style error
+        code.</para>
         </refsect1>

         <refsect1>
         <title>Notes</title>

         <para>The <function>sd_journal_get_data()</function>,
-        <function>sd_journal_enumerate_data()</function> and
-        <function>sd_journal_restart_data()</function>
+        <function>sd_journal_enumerate_data()</function>,
+        <function>sd_journal_restart_data()</function>,
+        <function>sd_journal_set_data_threshold()</function>
+        and
+        <function>sd_journal_get_data_threshold()</function>
         interfaces are available as shared library, which can
         be compiled and linked to with the
         <literal>libsystemd-journal</literal>
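
The man page text in this diff advises clients not to rely on the threshold alone and to apply their own size limit to returned fields. A minimal sketch of that advice follows. It assumes a hypothetical helper, `field_value_clamped()`, which is not part of libsystemd; it takes a `FIELD=value` buffer and its length (the shape of data that `sd_journal_get_data()` hands back) and clamps the value to a caller-chosen limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libsystemd function): locate the value part of
 * a "FIELD=value" buffer and report at most `limit` bytes of it.
 * Returns 0 on success, -1 if the buffer contains no '='. */
int field_value_clamped(const void *data, size_t length, size_t limit,
                        const char **value, size_t *value_len) {
        const char *p = data;
        const char *eq = memchr(p, '=', length);
        size_t n;

        if (!eq)
                return -1;

        /* Bytes that follow the '=' separator. */
        n = length - (size_t) (eq + 1 - p);

        *value = eq + 1;
        *value_len = n < limit ? n : limit;
        return 0;
}
```

A caller would typically pass the data pointer and length obtained from the journal straight through, with `limit` set to whatever the application is prepared to handle, regardless of the configured threshold.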

man/sd_journal_query_unique.xml

Lines changed: 3 additions & 1 deletion
@@ -113,7 +113,9 @@
         invocation of
         <function>sd_journal_enumerate_unique()</function>. Note
         that the data returned will be prefixed with the field
-        name and '='.</para>
+        name and '='. Note that this call is subject to the
+        data field size threshold as controlled by
+        <function>sd_journal_set_data_threshold()</function>.</para>

         <para><function>sd_journal_restart_unique()</function>
         resets the data enumeration index to the beginning of

src/journal/compress.c

Lines changed: 14 additions & 6 deletions
@@ -24,6 +24,7 @@
 #include <string.h>
 #include <lzma.h>

+#include "macro.h"
 #include "compress.h"

 bool compress_blob(const void *src, uint64_t src_size, void *dst, uint64_t *dst_size) {
bool compress_blob(const void *src, uint64_t src_size, void *dst, uint64_t *dst_size) {
@@ -66,10 +67,11 @@ bool compress_blob(const void *src, uint64_t src_size, void *dst, uint64_t *dst_
 }

 bool uncompress_blob(const void *src, uint64_t src_size,
-                     void **dst, uint64_t *dst_alloc_size, uint64_t* dst_size) {
+                     void **dst, uint64_t *dst_alloc_size, uint64_t* dst_size, uint64_t dst_max) {

         lzma_stream s = LZMA_STREAM_INIT;
         lzma_ret ret;
+        uint64_t space;
         bool b = false;

         assert(src);
@@ -98,7 +100,8 @@ bool uncompress_blob(const void *src, uint64_t src_size,
         s.avail_in = src_size;

         s.next_out = *dst;
-        s.avail_out = *dst_alloc_size;
+        space = dst_max > 0 ? MIN(*dst_alloc_size, dst_max) : *dst_alloc_size;
+        s.avail_out = space;

         for (;;) {
                 void *p;
@@ -111,18 +114,23 @@ bool uncompress_blob(const void *src, uint64_t src_size,
                 if (ret != LZMA_OK)
                         goto fail;

-                p = realloc(*dst, *dst_alloc_size*2);
+                if (dst_max > 0 && (space - s.avail_out) >= dst_max)
+                        break;
+
+                p = realloc(*dst, space*2);
                 if (!p)
                         goto fail;

                 s.next_out = (uint8_t*) p + ((uint8_t*) s.next_out - (uint8_t*) *dst);
-                s.avail_out += *dst_alloc_size;
+                s.avail_out += space;
+
+                space *= 2;

                 *dst = p;
-                *dst_alloc_size *= 2;
+                *dst_alloc_size = space;
         }

-        *dst_size = *dst_alloc_size - s.avail_out;
+        *dst_size = space - s.avail_out;
         b = true;

 fail:
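
The capped buffer growth in `uncompress_blob()` can be tried out in isolation. The sketch below is a simulation under stated assumptions, not the systemd code: `fake_stream` and `fake_code()` are hypothetical stand-ins for `lzma_stream` and `lzma_code()` that simply copy bytes, and `expand_capped()` is an illustrative name. It shows that the initial output space is limited to `dst_max`, and that the cap is checked only after a decode step. The output may therefore overshoot the cap once the buffer has been doubled, which matches the "hint only" wording in the man page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical stand-in for lzma_stream: input and output cursors only. */
struct fake_stream {
        const uint8_t *next_in;
        size_t avail_in;
        uint8_t *next_out;
        size_t avail_out;
};

/* Hypothetical stand-in for lzma_code(): copy as many bytes as fit. */
static void fake_code(struct fake_stream *s) {
        size_t n = MIN(s->avail_in, s->avail_out);

        memcpy(s->next_out, s->next_in, n);
        s->next_in += n;
        s->avail_in -= n;
        s->next_out += n;
        s->avail_out -= n;
}

/* Expand src into *dst, doubling the buffer as needed, but stop early once
 * at least dst_max bytes have been produced (dst_max == 0: no limit). */
bool expand_capped(const uint8_t *src, size_t src_size,
                   uint8_t **dst, size_t *dst_alloc_size,
                   size_t *dst_size, size_t dst_max) {
        struct fake_stream s = { src, src_size, NULL, 0 };
        size_t space;

        if (*dst_alloc_size == 0) {
                /* Start with a small buffer so the doubling is exercised. */
                uint8_t *p = realloc(*dst, 16);
                if (!p)
                        return false;
                *dst = p;
                *dst_alloc_size = 16;
        }

        space = dst_max > 0 ? MIN(*dst_alloc_size, dst_max) : *dst_alloc_size;
        s.next_out = *dst;
        s.avail_out = space;

        for (;;) {
                uint8_t *p;
                size_t used;

                fake_code(&s);

                if (s.avail_in == 0)    /* input exhausted, all output written */
                        break;

                if (dst_max > 0 && space - s.avail_out >= dst_max)
                        break;          /* enough output for the caller */

                used = space - s.avail_out;
                p = realloc(*dst, space * 2);
                if (!p)
                        return false;

                s.next_out = p + used;
                s.avail_out += space;
                space *= 2;

                *dst = p;
                *dst_alloc_size = space;
        }

        *dst_size = space - s.avail_out;
        return true;
}
```

With a 100-byte input and a fresh buffer, a `dst_max` of 10 stops after exactly 10 bytes, while a `dst_max` of 24 yields 32 bytes because the first 16-byte pass falls short of the cap and the next pass fills the doubled buffer.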

src/journal/compress.h

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
 bool compress_blob(const void *src, uint64_t src_size, void *dst, uint64_t *dst_size);

 bool uncompress_blob(const void *src, uint64_t src_size,
-                     void **dst, uint64_t *dst_alloc_size, uint64_t* dst_size);
+                     void **dst, uint64_t *dst_alloc_size, uint64_t* dst_size, uint64_t dst_max);

 bool uncompress_startswith(const void *src, uint64_t src_size,
                            void **buffer, uint64_t *buffer_size,

src/journal/coredumpctl.c

Lines changed: 10 additions & 0 deletions
@@ -342,6 +342,11 @@ static int dump_list(sd_journal *j) {

         assert(j);

+        /* The coredumps are likely to be compressed, and for just
+         * listing them we don't need to decompress them, so let's
+         * pick a fairly low data threshold here */
+        sd_journal_set_data_threshold(j, 4096);
+
         SD_JOURNAL_FOREACH(j) {
                 if (field)
                         print_field(stdout, j);
@@ -381,6 +386,9 @@ static int dump_core(sd_journal* j) {

         assert(j);

+        /* We want full data, nothing truncated. */
+        sd_journal_set_data_threshold(j, 0);
+
         r = focus(j);
         if (r < 0)
                 return r;
@@ -428,6 +436,8 @@ static int run_gdb(sd_journal *j) {

         assert(j);

+        sd_journal_set_data_threshold(j, 0);
+
         r = focus(j);
         if (r < 0)
                 return r;

src/journal/journal-file.c

Lines changed: 2 additions & 3 deletions
@@ -780,7 +780,7 @@ int journal_file_find_data_object_with_hash(

                 l -= offsetof(Object, data.payload);

-                if (!uncompress_blob(o->data.payload, l, &f->compress_buffer, &f->compress_buffer_size, &rsize))
+                if (!uncompress_blob(o->data.payload, l, &f->compress_buffer, &f->compress_buffer_size, &rsize, 0))
                         return -EBADMSG;

                 if (rsize == size &&
@@ -2591,7 +2591,6 @@ int journal_file_open_reliably(
                         metrics, mmap_cache, template, ret);
 }

-
 int journal_file_copy_entry(JournalFile *from, JournalFile *to, Object *o, uint64_t p, uint64_t *seqnum, Object **ret, uint64_t *offset) {
         uint64_t i, n;
         uint64_t q, xor_hash = 0;
@@ -2645,7 +2644,7 @@ int journal_file_copy_entry(JournalFile *from, JournalFile *to, Object *o, uint6
 #ifdef HAVE_XZ
                         uint64_t rsize;

-                        if (!uncompress_blob(o->data.payload, l, &from->compress_buffer, &from->compress_buffer_size, &rsize))
+                        if (!uncompress_blob(o->data.payload, l, &from->compress_buffer, &from->compress_buffer_size, &rsize, 0))
                                 return -EBADMSG;

                         data = from->compress_buffer;

src/journal/journal-internal.h

Lines changed: 2 additions & 0 deletions
@@ -121,6 +121,8 @@ struct sd_journal {
         uint64_t unique_offset;

         bool on_network;
+
+        size_t data_threshold;
 };

 char *journal_make_match_string(sd_journal *j);
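
The accessor implementations are not among the hunks shown here, so the following is only a sketch of how a setter/getter pair over the new `data_threshold` field might behave, based on the man page text (0 on success, negative errno-style code on error, 64K default, 0 meaning "off"). `struct journal_sketch`, the `journal_sketch_*` names, and `DEFAULT_DATA_THRESHOLD` are assumptions introduced for illustration, and the `-EINVAL` argument checks are a guess at reasonable error handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed default, taken from the 64K figure quoted in the man page. */
#define DEFAULT_DATA_THRESHOLD (64 * 1024)

/* Hypothetical stand-in for struct sd_journal; only the new field is modeled. */
struct journal_sketch {
        size_t data_threshold;
};

/* Give a freshly created handle the documented default threshold. */
void journal_sketch_init(struct journal_sketch *j) {
        j->data_threshold = DEFAULT_DATA_THRESHOLD;
}

/* Returns 0 on success or a negative errno-style code. */
int journal_sketch_set_data_threshold(struct journal_sketch *j, size_t sz) {
        if (!j)
                return -EINVAL;

        j->data_threshold = sz;         /* 0 turns the threshold off */
        return 0;
}

/* Returns 0 on success or a negative errno-style code. */
int journal_sketch_get_data_threshold(struct journal_sketch *j, size_t *sz) {
        if (!j || !sz)
                return -EINVAL;

        *sz = j->data_threshold;
        return 0;
}
```

In this model a reader that needs complete fields sets the threshold to 0 before iterating, which is the pattern the `coredumpctl.c` and `journald-server.c` hunks follow.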

src/journal/journal-verify.c

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ static int journal_file_object_verify(JournalFile *f, Object *o) {

                         if (!uncompress_blob(o->data.payload,
                                              le64toh(o->object.size) - offsetof(Object, data.payload),
-                                             &b, &alloc, &b_size))
+                                             &b, &alloc, &b_size, 0))
                                 return -EBADMSG;

                         h2 = hash64(b, b_size);

src/journal/journald-server.c

Lines changed: 2 additions & 0 deletions
@@ -934,6 +934,8 @@ int server_flush_to_var(Server *s) {
                 return r;
         }

+        sd_journal_set_data_threshold(j, 0);
+
         SD_JOURNAL_FOREACH(j) {
                 Object *o = NULL;
                 JournalFile *f;

src/journal/libsystemd-journal.sym

Lines changed: 2 additions & 0 deletions
@@ -86,4 +86,6 @@ global:
         sd_journal_fd_reliable;
         sd_journal_get_catalog;
         sd_journal_get_catalog_for_message_id;
+        sd_journal_set_data_threshold;
+        sd_journal_get_data_threshold;
 } LIBSYSTEMD_JOURNAL_195;
