37,662 questions
4
votes
2
answers
119
views
How to conditionally pass a vector to dplyr::select using quosures?
I am trying to set up a conditional argument in a function to drop columns from the tibble based on column name. There are a large number of columns but many are adjacent to one another and I would ...
Advice
0
votes
5
replies
182
views
Remove row with value x only if column contains both value x and y, by group
Below is my clunky way of getting what I'm looking for; I'd like to know if someone has a nicer, shorter method. I'm relatively concerned with performance. Ideally I'd try to avoid creating another object in the ...
-1
votes
0
answers
48
views
Why is dplyr::filter not working on the results of reshape2::melt? [duplicate]
I am trying to compare rows in a dataset and return a vector of all values that are NA in Row X but filled in Row Y. To do this I am reshaping the data using reshape2::melt and pivot_wider following ...
3
votes
4
answers
165
views
Filter grouped data by condition defined by changes in column values using R
Below is a data frame that is grouped by the id variable. Variable x takes on values from 0 to 2.
id<- c(1,1,1,2,2,3,3,4,4,4,4,4,4)
x <- c(2,2,0,0,0,1,2,1,1,1,0,2,1)
df <- data.frame(id, x)
...
1
vote
2
answers
111
views
How to use data from row before in a present row in one column?
I want to make a new column like column B in RStudio, but it keeps coming out like column C.
image example case for my case
If in excel the function is
I4 = IF(ABS(H4-H3)<0.1,I3+3600,3600).
In RStudio ...
0
votes
1
answer
91
views
How to left_join across multiple timepoints in long format [closed]
I'm trying to left_join data2 to data1, so that I end up with data3
data1 <- data.frame(ID = 1:4,
Time = 0,
X1 = c("a1", "a2", "...
0
votes
1
answer
115
views
How to create 2 new columns in R (date difference + convert Seasons to minutes)? [duplicate]
I am new to R and trying to create two new variables from my dataset.
My data frame is called netflix and it contains these relevant columns:
date_added and duration
Example values:
date_added: "...
2
votes
1
answer
112
views
Creating a unique animal identifier using multiple id columns some with missing data [duplicate]
I have a data set of mark-recapture data with the fields species, ear_tag, pit_tag, and sample_id. I need to create an animal_id column so I can identify each unique animal which unfortunately has ...
1
vote
1
answer
229
views
How to select data with the column names in a character string
TL;DR
In short, how do I get ok.netroles <- c("netrole1", "netrole2"); mutate(across(ok.netroles)... to work like mutate(across(.cols=netrole1, netrole2, ...))?
I am trying to ...
5
votes
1
answer
108
views
step_rename does not work like dplyr::rename
With dplyr::rename I can rename columns if they exist:
library(dplyr)
df <- data.frame(a_old = 1:3, b_new = 11:13)
lkp <- c(a_new = "a_old", b_new = "b_old")
df %>% rename(...
1
vote
2
answers
114
views
Add missing rows to dplyr column [duplicate]
I currently have the below data frame:
structure(list(cluster = c(2L, 3L, 5L, 5L, 6L, 6L, 7L, 9L, 9L,
10L, 10L), treatment = c("TreatmentA", "TreatmentA", "TreatmentA",
...
Tooling
1
vote
6
replies
335
views
Arranging data by combining three columns in r
I want to create a three-layer pie chart using ggplot2, which requires that the data be arranged in a certain way.
This is the data I have:
Level_1  Level_2     Level_3
Theme A  Subtheme 1  A
Theme A  Subtheme 1  ...
-2
votes
1
answer
107
views
Dplyr filter function returning error that class is logical when it is numeric [closed]
I'm trying to use the filter function to remove values less than 10 from my dataset and create a new data frame, but I keep getting this error. The class of the data was originally integer but I used ...
0
votes
1
answer
95
views
Display frequency of column value with value label in R
I have an SPSS dataset with the variables name, sex, and car_owned, where the value label of sex is "1= Male, 2 =Female" and car_owned is "1=BMW, 2=BYD, 3=Nissan".
The SPSS dataset ...
0
votes
1
answer
130
views
How to add a new column in a dataframe matching the matrix or multiple matrices by date variables
I want to add a seasonal_factor column to my D1 data frame. The seasonal factor is from another source, in matrix format, with 4 matrices per year from 2021 to 2024.
I get errors on matching the same ...
1
vote
3
answers
204
views
Rolling forecast of column values where new values depend on previous known values, without for loops
I am predicting unknown values based on previous known values in an R dataframe:
year  pop      datasource
2020  5999124  observation
2021  6017961  observation
2022  6036526  observation
2023  6054829  observation
...
4
votes
3
answers
204
views
Row-wise operations on column subsets in dplyr
I have a dataset with nine cities. I trained and tested four different machine learning models for each city. The results are in the tibble below:
set.seed(1)
result <-
tibble::tibble(city = ...
3
votes
5
answers
206
views
How to check if all values exist by group in R
I would like to do a check on my dataset to make sure a certain set of values are in every group and output a dataset showing all the values I am checking for and whether they exist in each group. How ...
4
votes
2
answers
164
views
How to control program flow from data.frame values
Let's say I have a data.frame df:
#fake dataset
id <- c(1,2,3,4,5,6,7,8,9,10,11,12,13)
loc <- c(1,1,2,2,2,3,3,3,3,3,3,3,3)
date <- c(2021, 2022, 2021, 2021, 2022, 2021, 2021, 2022, 2023, 2023,...
3
votes
2
answers
242
views
R how to filter and exclude specific date intervals based on multiple conditions
library(lubridate)
test <- data.frame(Location=c("A1","A1","A1","A1","A1","A2","A1","A1","A1","A1"...
2
votes
1
answer
90
views
expand versus crossing in combination with nesting for tidyr::complete giving inconsistent behavior
I'm having some issues trying to understand how to use expand and nesting in combination with tidyr::complete() to create a data frame where I fill in zeros for missing species.
In short, while expand(...
1
vote
2
answers
87
views
checking length of (possibly NULL) data-masked argument for flow control in R, dplyr
I'm trying to write a flexible grouping and summarizing function, with the interface f(data, group_vars, unit_vols) and struggling to inspect what's been passed to these data-masking arguments for ...
0
votes
2
answers
84
views
Change frequency table to long form [duplicate]
I have a frequency table that I would like to reformat so that each row is for a single observation. For example, if my data looked like the following, I would like to have 10 rows where year = 1 and ...
4
votes
2
answers
145
views
Replace NA values in R dataframe across multiple columns using truncated names of other columns [duplicate]
I have the following data frame (example):
myfile <- data.frame(C1=c(1,3,4,5),
C2=c(5,4,6,7),
C3=c(0,1,3,2),
C1_A=c(NA,NA,1,2),
...
4
votes
2
answers
103
views
mutate(across(map())): Any way to include the column name in a .progress message from map()?
If you want to call a function that returns multiple values on a column of a data.frame, and append these values as new columns to the data.frame, you can mutate(map()) |> unnest_wider(). If it's ...
2
votes
1
answer
106
views
summarise(across(starts_with())) in R
I have a dataframe which has three years of observations per ID and an indicator showing whether the ID was above or below a threshold in each year much like this:
library(dplyr)
library(tidyr)
set....
3
votes
3
answers
140
views
How to simplify across?
I have to use across() multiple times. Is there a way to simplify the following across() code?
ori_df <- data.frame(base_1 = 1:3,base_2 = 7:9,base_3 =3:1,
exp_1 =c(0.4,0.1,0.7),...
4
votes
3
answers
140
views
Calculate extra values for moving average (and other functions)
Suppose I have the following dataset:
library(dplyr)
library(zoo)
df <- data.frame(date = seq.Date(from = "2025-01-01", to = "2025-01-10"),
value = 1:10)
df
#&...
1
vote
1
answer
90
views
Can distinct in dplyr keep only columns that have single unique value?
I want to use distinct to collapse a data.frame based on distinct values of a column sample_id. I would like to only keep the columns that also have a single value for each distinct value of sample_id,...
0
votes
1
answer
105
views
bigrquery & DBI Error: Syntax error: Unexpected string literal 'table_name'
There appears to be a change or error with the bigrquery package. To connect to bigquery using DBI and dplyr you previously needed to do the following, which I got from https://github.com/r-dbi/...
0
votes
1
answer
166
views
How to mutate values in one dataframe from a different, summarised dataframe
I am attempting to blank correct some data, but I need to do so in a specific order. To do this, I need to find the average blank measurement for a given time-point, and then subtract it from every ...
6
votes
5
answers
221
views
Referencing a vector of values from other columns within dplyr::case_when()
I am using dplyr::case_when() to create a new column based on other columns of my dataframe. One of the cases is when a subset of the other columns are NA, then my new column is also NA. Here is a ...
4
votes
4
answers
161
views
Assign top 2 ranking by group
I am trying to assign a ranking across a group, and apply the ranking across the whole group
My data look like this:
colour <- rep(c("blue", "red"), 3)
day <- rep(c("mon...
2
votes
1
answer
107
views
How can we add all-zero rows and columns in a table made by tbl_hierarchical?
Let's say I have a table template and need to populate all the groups and values regardless of the analysis data, including all-zero rows or columns. I tried it this way:
library(gtsummary)
...
6
votes
5
answers
507
views
Conditional counting based on multiple conditions
I have a tidy ecological dataset in which every row is a single specimen/individual, with multiple columns for multiple variables.
#fake dataset
loc <- c(1,1,2,2,2,3,3,3,3,3,3,3,3)
date <- c(...
0
votes
0
answers
56
views
Dplyr select top n% of a dataframe by column value [duplicate]
I am trying to filter this toy dataframe dat to remove the bottom 10% of num, and am trying to adapt the code from this source in a simpler format for my needs. I have 1:10 in num, so I would like to ...
1
vote
1
answer
86
views
Estimate time spent in each cycle occurrence of a factor variable [closed]
Temperature is measured every two minutes. Temperatures between 2.0 and 7.9 are fine, 8.0 and above are hot, between 0.0 and 1.9 are cold, and below zero is too cold. I want to measure time in each ...
5
votes
3
answers
246
views
Efficiently group rows within tolerance for multiple numeric columns
I'm trying to group rows that have values within a specific error/tolerance.
Input looks like this:
input <- data.frame(Row_number = 1:22,
Name = c(rep("A",6), rep("...
2
votes
1
answer
116
views
Programmatically filter using expressions stored in another data frame
I have a data frame containing the specification for a set of regression models (regress_grid) with a column for different aspects of the model. I then use dplyr::rowwise() to estimate a model for ...
1
vote
2
answers
90
views
How to mutate a variable according to the difference of two character strings
library(dplyr)
raw_data <- data.frame(cat=c("a"),char_1=c("1kg|2kg"),char_2 = c("0kg|1kg|8kg"))
In raw_data, how can I mutate dif_char so that it holds the values that are in char_2 and not in ...
0
votes
2
answers
126
views
Subtract values from following rows and calculate percent change
I have a dataframe where I wish to generate an additional column with the value of the percent change calculated as the difference from the former row divided by the original value, then multiplied by ...
8
votes
8
answers
1k
views
How can I split a vector of numbers into its digits when vector values differ in length?
I'm having trouble with the following in R using dplyr and separate. I have a dataframe like this:
x<-c(1,2,12,31,123,2341)
df<-data.frame(x)
and I'd like to split each digit into its ...
0
votes
1
answer
116
views
In `dplyr` `filter`, how to parse conditions which are stored in string? [duplicate]
In dplyr filter, how can I parse conditions that are stored in a string? When there's only one condition, the code below works:
library(dplyr)
conditions_string_1 <- "Species=='versicolor'"
iris ...
0
votes
1
answer
76
views
How to group by one column and select the presence of two strings in another column [duplicate]
I am trying to subset the data so that when I group by letter, I subset only for letters that have BOTH 'apple' and 'cherry'
This is the code I have used but 'filtered_data' is showing empty. What am ...
3
votes
2
answers
123
views
Filter grouped data with filter function and if else condition in R
Consider the data frame
id <-c(1,1,1,2,2,3,3,3,3)
x1 <-c("no","yes","yes","no","no","no","no","no","yes")...
2
votes
3
answers
158
views
How to pass string into function from within mutate
I'm working with pre/post assessment data. Each question has a score out of 5 and is collected at two timepoints for each student.
library(tidyverse)
name <- c("Student 1", "Student ...
3
votes
3
answers
163
views
Calculate difference of means of multiple variables by categories and groups
I want to calculate the difference of the means of multiple variables (e.g., size, weight) by categories (e.g., color, taste) and groups (e.g., fruit) in R.
So I want to transform this
fruit  color  ...
0
votes
2
answers
91
views
How to reorganize data based on unique cases of a variable? [duplicate]
I have a dataset that has course completions from a variety of subject areas for a very large number of students. Each row of data is organized as follows:
Student ID  Gender  City  Course  ETC.
10102       ...
0
votes
1
answer
59
views
Proportionately redistributing values
(The last) follow up question: Adjusting values in a table based on another table, Splitting multiple values among multiple values
I have these tables in R:
myt_with_counts <- data.frame(
name = c(...
0
votes
2
answers
72
views
Splitting multiple values among multiple values
This is a follow-up question: Adjusting values in a table based on another table
I have two tables:
myt_repeating <- data.frame(
name = c("a", "a", "a", "b"...