Changing values in dataframe iterating over all rows and multiple columns
I need to change some values in my dataframe by iterating over rows. For each row, if any column contains a 1, I need to change the 0 values in the other columns to NA.
I have code that works, but it is very slow on a larger dataset.
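data <- data.frame(id = c("A", "B", "C"), V1 = c(1, 0, 0), V2 = c(0, 0, 0), V3 = c(1, 0, 1))
cols <- names(data)[2:4]
for (i in seq_len(nrow(data))) {
  # if any value column in this row holds a 1, set the zeros in that row to NA
  if (any(data[i, cols] == 1)) {
    data[i, cols][data[i, cols] == 0] <- NA
  }
}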
I have an example data set
data
  id V1 V2 V3
1  A  1  0  1
2  B  0  0  0
3  C  0  0  1
and the expected (and the actual) result is
data
  id V1 V2 V3
1  A  1 NA  1
2  B  0  0  0
3  C NA NA  1
How can I write this in a more efficient way?
r dataframe
asked Jan 18 at 14:11 by user570271 (new contributor)
edited Jan 18 at 14:21 by Ronak Shah
Try
i1 <- rowSums(data[-1] == 1) > 0;data[-1][i1,] <- NA^ !data[-1][i1,]
– akrun
Jan 18 at 14:14
3 Answers
A one-liner could be:
data[rowSums(data[-1]) > 0, ] <- replace(data[rowSums(data[-1]) > 0, ],
                                         data[rowSums(data[-1]) > 0, ] == 0,
                                         NA)
data
# id V1 V2 V3
#1 A 1 NA 1
#2 B 0 0 0
#3 C NA NA 1
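To avoid evaluating the same expression repeatedly, we can compute it once first:
# run this on the original data, as an alternative to the one-liner above;
# once NAs are present, rowSums() returns NA and the row index becomes invalid
v1 <- rowSums(data[-1]) > 0
data[v1, ] <- replace(data[v1, ],
                      data[v1, ] == 0,
                      NA)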
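The same idea can also be sketched on a numeric matrix, which limits the comparison to the value columns and tolerates NAs already present in the data. This is only an illustration under stated assumptions: `zeros_to_na` is a hypothetical helper name that does not appear in the answers, and the value columns are assumed to be numeric.

```r
# Sample data from the question
data <- data.frame(id = c("A", "B", "C"),
                   V1 = c(1, 0, 0), V2 = c(0, 0, 0), V3 = c(1, 0, 1))

# Hypothetical helper (not from the original answers): for every row that
# contains at least one 1 in `cols`, set the 0 values in `cols` to NA.
zeros_to_na <- function(df, cols) {
  m <- as.matrix(df[cols])                      # numeric matrix of the value columns
  has_one <- rowSums(m == 1, na.rm = TRUE) > 0  # one logical flag per row
  mask <- has_one & (m == 0)                    # has_one is recycled down each column
  m[which(mask)] <- NA                          # which() skips NA entries in the mask
  df[cols] <- m
  df
}

result <- zeros_to_na(data, c("V1", "V2", "V3"))
result
```

Because `which()` drops NA entries from the mask, the helper can be applied again to its own output without error.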
answered Jan 18 at 14:32 by Sotos
edited Jan 18 at 15:37
It is easy with dplyr, assuming you want to change the values in columns V1 and V2 based on the values in V3. We specify the columns to change in mutate_at, and in the funs argument we specify the condition under which a value is replaced.
library(dplyr)
data %>% mutate_at(vars(V1:V2), funs(replace(., V3 == 1 & . == 0, NA)))
# id V1 V2 V3
#1 A 1 NA 1
#2 B 0 0 0
#3 C NA NA 1
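Since `funs()` was deprecated in dplyr 0.8.0, the same replacement could be written with `across()`, which is available from dplyr 1.0.0 onwards. This is a sketch of an equivalent call rather than the answer's own code; the variable name `updated` is only for illustration.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(id = c("A", "B", "C"),
                   V1 = c(1, 0, 0), V2 = c(0, 0, 0), V3 = c(1, 0, 1))

# For V1 and V2, replace a 0 with NA wherever V3 is 1 in the same row
updated <- data %>%
  mutate(across(V1:V2, ~ replace(.x, V3 == 1 & .x == 0, NA)))

updated
```

As in the original answer, the condition is tied to V3 only, so it matches the question's loop only when V3 is the column that carries the 1.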
answered Jan 18 at 14:16 by Ronak Shah
We can do this in base R by creating a logical vector with rowSums and then updating the numeric columns based on this index:
i1 <- rowSums(data[-1] == 1) > 0
data[-1][i1,] <- NA^ !data[-1][i1,]
data
# id V1 V2 V3
#1 A 1 NA 1
#2 B 0 0 0
#3 C NA NA 1
If the index needs to be based on a single column, say 'V3', change 'i1' to
i1 <- data$V3 == 1
Then update the other numeric columns. After subsetting the rows with 'i1', create a logical matrix by negation (! returns TRUE for 0 values and FALSE for all others). Applying NA^ to that logical matrix returns NA for TRUE and 1 for FALSE. As the columns hold only binary values, the result can be assigned back directly:
data[i1, 2:3] <- NA^!data[i1, 2:3]
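The NA^ step relies on two arithmetic facts in R: NA^1 is NA, and anything raised to the power 0 is 1, including NA. A small sketch on plain vectors, with illustrative variable names that are not part of the answer, shows the behaviour and why the shortcut is only safe for 0/1 data:

```r
# TRUE is treated as 1 and FALSE as 0 in arithmetic
flags <- c(TRUE, FALSE, FALSE)
from_flags <- NA^flags            # NA 1 1

# Negating numbers: !0 is TRUE, any non-zero value gives FALSE
binary <- c(0, 1, 0)
from_binary <- NA^(!binary)       # NA 1 NA

# Caveat: a non-binary value such as 2 is silently turned into 1
mixed <- c(0, 1, 2)
from_mixed <- NA^(!mixed)         # NA 1 1

# replace() keeps the original non-zero values instead
kept <- replace(mixed, mixed == 0, NA)   # NA 1 2
```

If the columns may hold values other than 0 and 1, `replace()` is therefore the safer choice.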
answered Jan 18 at 14:18 by akrun
edited Jan 18 at 14:28
What's the meaning of NA^ ?
– user570271
Jan 18 at 14:23
@user570271 It is an easier way to replace the TRUE values with NA, e.g. v1 <- c(TRUE, FALSE, FALSE); NA^v1. In the example 'data', we create a logical matrix with !data[i1, 2:3], where TRUE marks the 0 values and FALSE all others. NA^ turns the TRUE values into NA and the others into 1.
– akrun
Jan 18 at 14:24