How to count the number of underscores and split the string on the middle one only?

I would like to count the number of underscores and split the string into two different strings at the middle underscore.



strings <- c('aa_bb_cc_dd_ee_ff', 'cc_hh_ff_zz', 'bb_dd')


Desired Output:



First        Last
"aa_bb_cc" "dd_ee_ff"
"cc_hh" "ff_zz"
"bb" "dd"









  • Possible duplicate of Split on first/nth occurrence of delimiter – markus, Jan 18 at 19:07

  • What happens when there are an even number of underscores (e.g., aa_bb_cc)? – Lyngbakr, Jan 18 at 19:09

Tags: r, string

asked Jan 18 at 18:58 by ldan
edited Jan 18 at 20:41 by IceCreamToucan

3 Answers
Here's a kludgy solution that assumes there is always an odd number of underscores.

# Load libraries
library(stringr)

# Define function
even_split <- function(s){
  # Split each string into its underscore-separated pieces
  tmp <- str_split(s, "_")

  lapply(tmp, function(x){
    # Patch the pieces back together as two halves
    c(paste(x[1:(length(x)/2)], collapse = "_"),
      paste(x[(1+length(x)/2):length(x)], collapse = "_"))
  })
}

# Example
strings <- c('aa_bb_cc_dd_ee_ff', 'cc_hh_ff_zz', 'bb_dd')

# Test function
even_split(strings)
#> [[1]]
#> [1] "aa_bb_cc" "dd_ee_ff"
#>
#> [[2]]
#> [1] "cc_hh" "ff_zz"
#>
#> [[3]]
#> [1] "bb" "dd"


Created on 2019-01-18 by the reprex package (v0.2.1)
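
The question asks for two columns, First and Last, while the function above returns a list. A minimal sketch in base R of one way to get that layout, still assuming an odd number of underscores in every string. The helper name `split_to_columns` is hypothetical and not part of the original answer:

```r
# Hypothetical helper: split each string at its middle underscore and
# return a character matrix with columns "First" and "Last".
# Assumes an odd number of underscores (an even number of pieces).
split_to_columns <- function(s) {
  pieces <- strsplit(s, "_", fixed = TRUE)
  halves <- lapply(pieces, function(x) {
    half <- length(x) %/% 2
    c(First = paste(x[seq_len(half)], collapse = "_"),
      Last  = paste(x[half + seq_len(length(x) - half)], collapse = "_"))
  })
  # Stack the named two-element vectors into one row per input string
  do.call(rbind, halves)
}

strings <- c("aa_bb_cc_dd_ee_ff", "cc_hh_ff_zz", "bb_dd")
result <- split_to_columns(strings)
print(result)
```

Wrapping the result in `as.data.frame()` would give a data frame instead of a matrix, if that is more convenient downstream.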







answered Jan 18 at 19:19 by Lyngbakr

























Adapting nhahtdh's answer here, all you need to do is add a step that counts the underscores (done here with str_count) and uses half that count to split at the middle underscore.

library(stringr)

strsplit(
  strings,
  paste0("^[^_]*(?:_[^_]*){", str_count(strings, '_') %/% 2, "}\\K_"),
  perl = TRUE)

# [[1]]
# [1] "aa_bb_cc" "dd_ee_ff"
#
# [[2]]
# [1] "cc_hh" "ff_zz"
#
# [[3]]
# [1] "bb" "dd"
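
To see what the call above is doing, it may help to build the per-string patterns separately and inspect them. The following is a sketch in base R only (no stringr); the variable names `n_underscores`, `patterns` and `halves` are illustrative and not from the original answer:

```r
strings <- c("aa_bb_cc_dd_ee_ff", "cc_hh_ff_zz", "bb_dd")

# Count underscores per string without stringr: 5, 3 and 1 here
n_underscores <- lengths(regmatches(strings, gregexpr("_", strings, fixed = TRUE)))

# One pattern per string: from the start, skip n %/% 2 underscore-led groups,
# then \K resets the match start so only the following "_" is consumed.
patterns <- paste0("^[^_]*(?:_[^_]*){", n_underscores %/% 2, "}\\K_")

print(n_underscores)
print(patterns)

# strsplit() recycles `split` along `x`, so string i is split by pattern i
halves <- strsplit(strings, patterns, perl = TRUE)
print(halves)
```

Because the pattern is anchored with `^`, it cannot match again in the remainder after the first split, which is why each string yields exactly two pieces.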






edited Jan 18 at 20:28
answered Jan 18 at 19:44 by IceCreamToucan























This assumes an odd number of underscores, and 99 or fewer.

library(stringr)
library(strex)
strings <- c('aa_bb_cc_dd_ee_ff', 'cc_hh_ff_zz', 'bb_dd')

splitMiddleUnderscore <- function(x){
  nUnderscore <- str_count(x, '_')
  # Index of the middle underscore: 1 -> 1, 3 -> 2, 5 -> 3, ... (odd counts up to 99)
  middleUnderscore <- match(nUnderscore, seq(1, 99, 2))
  str1 <- str_before_nth(x, '_', middleUnderscore)
  str2 <- str_after_nth(x, '_', middleUnderscore)
  c(str1, str2)
}

lapply(strings, splitMiddleUnderscore)

#[[1]]
#[1] "aa_bb_cc" "dd_ee_ff"

#[[2]]
#[1] "cc_hh" "ff_zz"

#[[3]]
#[1] "bb" "dd"






answered Jan 18 at 19:36 by Bill O'Brien








  • You can use middleUnderscore <- str_count(x, '_') %/% 2 + 1 to avoid the "99 or fewer" requirement. – IceCreamToucan, Jan 18 at 20:11
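
The `%/% 2 + 1` rule from the comment can also be tried without strex. A minimal sketch in base R, where the helper name `split_middle` is hypothetical. Two behaviours are assumptions of this sketch rather than anything stated in the thread: with an even number of underscores it splits at the later of the two central ones, and with no underscore it returns the input plus `NA`:

```r
# Hypothetical helper: split one string at underscore number n %/% 2 + 1,
# where n is the total number of underscores in the string.
split_middle <- function(x) {
  pos <- gregexpr("_", x, fixed = TRUE)[[1]]  # positions of every underscore
  if (pos[1] == -1) {
    # Assumption: nothing to split on, so return the input and NA
    return(c(x, NA_character_))
  }
  mid <- pos[length(pos) %/% 2 + 1]           # position of the middle underscore
  c(substr(x, 1, mid - 1), substr(x, mid + 1, nchar(x)))
}

strings <- c("aa_bb_cc_dd_ee_ff", "cc_hh_ff_zz", "bb_dd")
print(lapply(strings, split_middle))
```

Since the index is computed directly from the count, there is no upper limit on the number of underscores.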















