Changing values in dataframe iterating over all rows and multiple columns












I need to change some values in my data frame row by row. For each row, if any column contains a 1, the 0 values in the other columns should become NA.

I have code that works, but it is very slow on a larger dataset.



data = data.frame(id=c("A","B","C"),V1=c(1,0,0),V2=c(0,0,0),V3=c(1,0,1))
cols = names(data)[2:4]

for (i in 1:nrow(data)){
  if(any(data[i,cols]==1)){
    data[i,cols][data[i,cols]==0]=NA
  }
}


The example data set (before the loop runs) is

data
  id V1 V2 V3
1  A  1  0  1
2  B  0  0  0
3  C  0  0  1


and the expected (and the actual) result is



data
  id V1 V2 V3
1  A  1 NA  1
2  B  0  0  0
3  C NA NA  1


How can I write this more efficiently?







r dataframe














edited Jan 18 at 14:21 by Ronak Shah















asked Jan 18 at 14:11 by user570271

















  • Try i1 <- rowSums(data[-1] == 1) > 0;data[-1][i1,] <- NA^ !data[-1][i1,]

    – akrun
    Jan 18 at 14:14































3 Answers















A one-liner could be:

data[rowSums(data[-1]) > 0,] <- replace(data[rowSums(data[-1]) > 0,],
                                        data[rowSums(data[-1]) > 0,] == 0,
                                        NA)
data
#   id V1 V2 V3
# 1  A  1 NA  1
# 2  B  0  0  0
# 3  C NA NA  1

This relies on the value columns holding only 0 and 1, so a positive row sum means the row contains at least one 1. To avoid evaluating the same expression repeatedly, we can compute it once and reuse it:

v1 <- rowSums(data[-1]) > 0
data[v1,] <- replace(data[v1,],
                     data[v1,] == 0,
                     NA)














answered Jan 18 at 14:32 by Sotos (edited Jan 18 at 15:37)
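As a hedged variation on the approach above, the row mask and the zero test can be limited to the value columns. That way the id column is never compared with 0, and the row test checks for a 1 explicitly instead of relying on a positive row sum. The names `cols`, `has_one`, and `vals` are illustrative and not part of the original answer.

```r
# Example data from the question
data <- data.frame(id = c("A", "B", "C"),
                   V1 = c(1, 0, 0),
                   V2 = c(0, 0, 0),
                   V3 = c(1, 0, 1))

cols <- c("V1", "V2", "V3")               # value columns only (assumed names)
has_one <- rowSums(data[cols] == 1) > 0   # TRUE for rows containing at least one 1

vals <- data[has_one, cols]               # sub-block holding the flagged rows
vals[vals == 0] <- NA                     # zeros in those rows become NA
data[has_one, cols] <- vals               # write the block back in one assignment
```

Because the comparison and the assignment each run once over a block rather than once per row, this avoids the per-row data frame indexing that makes the original loop slow.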






































This is easy with dplyr, assuming you want to change the values in columns V1 and V2 based on the values in V3. Name the columns to modify in mutate_at, and give the replacement condition in the funs argument.

library(dplyr)

data %>% mutate_at(vars(V1:V2), funs(replace(., V3 == 1 & . == 0, NA)))

#   id V1 V2 V3
# 1  A  1 NA  1
# 2  B  0  0  0
# 3  C NA NA  1
















answered Jan 18 at 14:16 by Ronak Shah
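For readers without dplyr, the same V3-based rule can be sketched in base R. Note that funs() has been deprecated since dplyr 0.8.0, so the call above may warn on newer versions. The sketch below sidesteps that dependency. `target` is a hypothetical name for the columns to modify.

```r
# Example data from the question
data <- data.frame(id = c("A", "B", "C"),
                   V1 = c(1, 0, 0),
                   V2 = c(0, 0, 0),
                   V3 = c(1, 0, 1))

target <- c("V1", "V2")   # columns to modify (assumed names)

# For each target column, set a value to NA where V3 is 1 and the value is 0
data[target] <- lapply(data[target], function(x) {
  replace(x, data$V3 == 1 & x == 0, NA)
})
```

Like the dplyr version, this keys only on V3, so it matches the loop in the question only when a 1 elsewhere in the row always comes with a 1 in V3.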






































We can do this in base R by creating a logical row index with rowSums and then updating the numeric columns for those rows.

i1 <- rowSums(data[-1] == 1) > 0
data[-1][i1,] <- NA^ !data[-1][i1,]
data
#   id V1 V2 V3
# 1  A  1 NA  1
# 2  B  0  0  0
# 3  C NA NA  1

If the index should be based on a single column, say 'V3', change 'i1' to

i1 <- data$V3 == 1

and update the other numeric columns. After subsetting the rows with 'i1', negation (!) gives a logical matrix that is TRUE for 0 values and FALSE for all others. Applying NA^ to that matrix returns NA for TRUE and 1 for FALSE, because NA^1 is NA while NA^0 is 1. Since the data contain only binary values, the result can be assigned back directly:

data[i1, 2:3] <- NA^!data[i1, 2:3]














answered Jan 18 at 14:18 by akrun (edited Jan 18 at 14:28)













  • What's the meaning of NA^?

    – user570271
    Jan 18 at 14:23

  • @user570271 It is a compact way to turn TRUE values into NA, e.g. v1 <- c(TRUE, FALSE, FALSE); NA^v1. In the example 'data', !data[i1, 2:3] creates a logical matrix that is TRUE where the value is 0 and FALSE otherwise. NA^ then maps TRUE to NA and FALSE to 1.

    – akrun
    Jan 18 at 14:24
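To make the NA^ explanation concrete, here is a small sketch on a plain vector and on the two rows of the example data that contain a 1. It also shows a variant that is not from the original answer: multiplying by the mask, which would keep non-zero values other than 1. The names `res`, `m`, `mask`, `x`, and `general_result` are illustrative.

```r
# NA^TRUE is NA^1, which is NA; NA^FALSE is NA^0, which R defines as 1
v1 <- c(TRUE, FALSE, FALSE)
res <- NA^v1

# Rows 1 and 3 of the question's data (the rows that contain a 1)
m <- cbind(V1 = c(1, 0), V2 = c(0, 0), V3 = c(1, 1))
mask <- NA^(m == 0)   # NA where the value is 0, 1 everywhere else

# With binary data the mask itself is already the desired result.
# Assumed non-binary data: multiplying keeps the original non-zero values.
x <- c(2, 0, 5, 0)
general_result <- x * NA^(x == 0)
```

Here `m == 0` plays the same role as `!m` in the answer. Both are TRUE exactly where the value is 0.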




















