Skip to content

Commit be67a4d

Browse files
aitapben-schwen
andauthored
dogroups: don't access columns beyond end of table (#7485)
* dogroups: don't access columns beyond end of table Previously, dogroups() could try to read elements after the (resized) end of over-allocated data.table list, expecting them to be NULL. This didn't crash in practice, but is now explicitly checked for (and disallowed). * add NEWS --------- Co-authored-by: Benjamin Schwendinger <benjaminschwe@gmail.com>
1 parent 348aaf4 commit be67a4d

File tree

2 files changed

+6
-4
lines changed

2 files changed

+6
-4
lines changed

NEWS.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -352,6 +352,8 @@ See [#2611](https://github.com/Rdatatable/data.table/issues/2611) for details. T
352352

353353
26. Grouping by a factor with many groups is now fast again, fixing a timing regression introduced in [#6890](https://github.com/Rdatatable/data.table/pull/6890) where UTF-8 coercion and level remapping were performed unnecessarily, [#7404](https://github.com/Rdatatable/data.table/issues/7404). Thanks @ben-schwen for the report and fix.
354354

355+
27. `dogroups()` no longer reads beyond the resized end of over-allocated data.table list columns, [#7486](https://github.com/Rdatatable/data.table/issues/7486). While this didn't crash in practice, it is now explicitly checked for in recent R versions (r89198+). Thanks @TimTaylor and @aitap for the report and @aitap for the fix.
356+
355357
### NOTES
356358

357359
1. The following in-progress deprecations have proceeded:

src/dogroups.c

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -308,17 +308,16 @@ SEXP dogroups(SEXP dt, SEXP dtcols, SEXP groups, SEXP grpcols, SEXP jiscols, SEX
308308
error(_("RHS of := is NULL during grouped assignment, but it's not possible to delete parts of a column."));
309309
int vlen = length(RHS);
310310
if (vlen>1 && vlen!=grpn) {
311-
SEXP colname = isNull(VECTOR_ELT(dt, INTEGER(lhs)[j]-1)) ? STRING_ELT(newnames, INTEGER(lhs)[j]-origncol-1) : STRING_ELT(dtnames,INTEGER(lhs)[j]-1);
311+
SEXP colname = INTEGER(lhs)[j] > LENGTH(dt) ? STRING_ELT(newnames, INTEGER(lhs)[j]-origncol-1) : STRING_ELT(dtnames,INTEGER(lhs)[j]-1);
312312
error(_("Supplied %d items to be assigned to group %d of size %d in column '%s'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code."),vlen,i+1,grpn,CHAR(colname));
313313
// e.g. in #91 `:=` did not issue recycling warning during grouping. Now it is error not warning.
314314
}
315315
}
316316
int n = LENGTH(VECTOR_ELT(dt, 0));
317317
for (int j=0; j<length(lhs); ++j) {
318318
int colj = INTEGER(lhs)[j]-1;
319-
target = VECTOR_ELT(dt, colj);
320319
RHS = VECTOR_ELT(jval,j%LENGTH(jval));
321-
if (isNull(target)) {
320+
if (colj >= LENGTH(dt)) {
322321
// first time adding to new column
323322
if (R_maxLength(dt) < colj+1) internal_error(__func__, "Trying to add new column by reference but tl is full; setalloccol should have run first at R level before getting to this point"); // # nocov
324323
target = PROTECT(allocNAVectorLike(RHS, n));
@@ -332,7 +331,8 @@ SEXP dogroups(SEXP dt, SEXP dtcols, SEXP groups, SEXP grpcols, SEXP jiscols, SEX
332331
UNPROTECT(1);
333332
SET_STRING_ELT(dtnames, colj, STRING_ELT(newnames, colj-origncol));
334333
copyMostAttrib(RHS, target); // attributes of first group dominate; e.g. initial factor levels come from first group
335-
}
334+
} else
335+
target = VECTOR_ELT(dt, colj);
336336
bool copied = false;
337337
if (isNewList(target) && anySpecialStatic(RHS, specials)) { // see comments in anySpecialStatic()
338338
RHS = PROTECT(copyAsPlain(RHS));

0 commit comments

Comments
 (0)