gsub / sub to extract between certain characters
















How can I extract the numbers / ID from the following string in R?



link <- "D:/temp/sample_data/0000098618-13-000011.htm"



I want to extract just 0000098618-13-000011.



That is, discard the .htm extension and the D:/temp/sample_data/ prefix.



I have tried grep and gsub without much luck.

















Tags: r, regex














asked Mar 7 at 22:34 by user8959427 (edited Mar 7 at 22:35 by d.b)






















          2 Answers














          1) basename. Use basename followed by sub:



          sub("\\..*", "", basename(link))
          ## [1] "0000098618-13-000011"


          2) file_path_sans_ext



          library(tools)
          file_path_sans_ext(basename(link))
          ## [1] "0000098618-13-000011"


          3) sub



          sub(".*/(.*)\\..*", "\\1", link)
          ## [1] "0000098618-13-000011"


          4) gsub



          gsub(".*/|\\.[^.]*$", "", link)
          ## [1] "0000098618-13-000011"


          5) strsplit



          sapply(strsplit(link, "[/.]"), function(x) tail(x, 2)[1])
          ## [1] "0000098618-13-000011"


          6) read.table. If link is a vector, this will only work if all elements have the same number of /-separated components. This also assumes that the only dot is the one separating the extension.



          DF <- read.table(text = link, sep = "/", comment = ".", as.is = TRUE)
          DF[[ncol(DF)]]
          ## [1] "0000098618-13-000011"
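
The options above do not all treat dots the same way once a file name contains more than one. As a rough sketch (the second path is a made-up example, not from the question), this compares option 1, option 4, and file_path_sans_ext applied to the basename:

```r
# The first path is from the question; the second is hypothetical,
# added only to show what happens with two dots in the file name.
links <- c("D:/temp/sample_data/0000098618-13-000011.htm",
           "D:/temp/sample_data/report.v2.htm")

# Option 1: drops everything from the FIRST dot of the file name onward.
first_dot <- sub("\\..*", "", basename(links))

# Option 4: drops the directory part and only the LAST dot-delimited piece.
last_dot <- gsub(".*/|\\.[^.]*$", "", links)

# tools ships with R; file_path_sans_ext() also strips only the final extension.
sans_ext <- tools::file_path_sans_ext(basename(links))

print(first_dot)  # "0000098618-13-000011" "report"
print(last_dot)   # "0000098618-13-000011" "report.v2"
print(sans_ext)   # "0000098618-13-000011" "report.v2"
```

All three work on a character vector, so they can be applied to a whole column of paths at once. Which behavior is right depends on whether dots can appear inside the IDs.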













          answered Mar 7 at 22:35 by G. Grothendieck (edited Mar 8 at 14:14)












            • Thanks! What does the basename function do?

              – user8959427
              Mar 7 at 22:36











            • Or just tools::file_path_sans_ext(basename("foo/bar/quux.html")).

              – r2evans
              Mar 7 at 22:37












            • It works on the whole data set, but basename is new to me.

              – user8959427
              Mar 7 at 22:38






            • Look at ?basename.

              – neilfws
              Mar 7 at 22:38












            • Or a basename - dirname combo: dirname(chartr(".", "/", basename(link)))

              – markus
              Mar 7 at 22:44
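
For anyone with the same question about basename, here is a small sketch of what basename and dirname return for the path in the question, followed by the basename/dirname combo from the last comment broken into steps. The intermediate variable names are only for illustration.

```r
link <- "D:/temp/sample_data/0000098618-13-000011.htm"

# basename() keeps what follows the last "/"; dirname() keeps what precedes it.
file_name <- basename(link)   # "0000098618-13-000011.htm"
directory <- dirname(link)    # "D:/temp/sample_data"

# The combo from the comment: turn the dot into a path separator,
# then let dirname() drop the final component (the extension).
as_path <- chartr(".", "/", file_name)   # "0000098618-13-000011/htm"
id      <- dirname(as_path)              # "0000098618-13-000011"

print(id)
```

Neither function checks that the file exists; both work purely on the text of the path.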






























                Using stringr:



                library(stringr)
                str_extract(link, "[0-9-]+")

                # "0000098618-13-000011"
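
One thing to keep in mind with this pattern: it returns the first run of digits and hyphens anywhere in the string, so a digit in a directory name would be picked up first. The sketch below uses base R (regmatches with regexpr, roughly the counterpart of str_extract) and a hypothetical path, link2, to show the difference when the run is required to sit directly before the extension.

```r
# Hypothetical path with a digit in a directory name (not from the question).
link2 <- "D:/temp2/sample_data/0000098618-13-000011.htm"

# First run of digits/hyphens anywhere in the string.
first_run <- regmatches(link2, regexpr("[0-9-]+", link2))

# Same character class, but with a lookahead requiring the run to be
# followed by the final ".ext" at the end of the string (needs perl = TRUE).
anchored <- regmatches(link2,
                       regexpr("[0-9-]+(?=\\.[^./]*$)", link2, perl = TRUE))

print(first_run)  # "2"
print(anchored)   # "0000098618-13-000011"
```

For the path in the question, which has no digits or hyphens before the file name, both patterns give the same result.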
















                answered Mar 8 at 14:23 by sindri_baldur


























