gsub / sub to extract between certain characters

How can I extract the numbers / ID from the following string in R?

link <- "D:/temp/sample_data/0000098618-13-000011.htm"

I want to extract just 0000098618-13-000011, that is, discard the .htm extension and the D:/temp/sample_data/ directory.

I have tried grep and gsub without much luck.







      r regex






asked Mar 7 at 22:34 by user8959427
edited Mar 7 at 22:35 by d.b






















          2 Answers
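1) basename. Use basename to drop the directory, then sub to remove everything from the first dot onward:

sub("\\..*", "", basename(link))
## [1] "0000098618-13-000011"

2) file_path_sans_ext. Drop the directory with basename, then the extension with file_path_sans_ext from the tools package:

library(tools)
file_path_sans_ext(basename(link))
## [1] "0000098618-13-000011"

3) sub. Capture the text between the last slash and the last dot:

sub(".*/(.*)\\..*", "\\1", link)
## [1] "0000098618-13-000011"

4) gsub. Delete everything up to the last slash, and the trailing extension:

gsub(".*/|\\.[^.]*$", "", link)
## [1] "0000098618-13-000011"

5) strsplit. Split on slashes and dots, then take the second-to-last piece:

sapply(strsplit(link, "[/.]"), function(x) tail(x, 2)[1])
## [1] "0000098618-13-000011"

6) read.table. If link is a vector, this works only if all elements have the same number of /-separated components. It also assumes that the only dot is the one separating the extension.

DF <- read.table(text = link, sep = "/", comment.char = ".", as.is = TRUE)
DF[[ncol(DF)]]
## [1] "0000098618-13-000011"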
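
The approaches above agree on the example path, but they can differ once a file name contains more than one dot. A minimal sketch, assuming a hypothetical second path with a .tar.gz name (not from the question), and with the backslashes doubled as R string literals require:

```r
# Example input from the question plus a hypothetical multi-dot file name
links <- c("D:/temp/sample_data/0000098618-13-000011.htm",
           "D:/temp/sample_data/archive.tar.gz")  # hypothetical path

# Cut at the FIRST dot of the file name
first_dot <- sub("\\..*", "", basename(links))

# Cut only the LAST extension
last_ext <- tools::file_path_sans_ext(basename(links))

# Regex-only variant: drop the directory and the last extension
regex_only <- gsub(".*/|\\.[^.]*$", "", links)

print(first_dot)   # "0000098618-13-000011" "archive"
print(last_ext)    # "0000098618-13-000011" "archive.tar"
print(regex_only)  # "0000098618-13-000011" "archive.tar"
```

All three calls are vectorized, so they can be applied to a whole column of paths at once.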






answered Mar 7 at 22:35 by G. Grothendieck
edited Mar 8 at 14:14












• Thanks! What does the basename function do? – user8959427, Mar 7 at 22:36
• Or just tools::file_path_sans_ext(basename("foo/bar/quux.html")). – r2evans, Mar 7 at 22:37
• It works on the whole data set, but basename is new to me. – user8959427, Mar 7 at 22:38
• Look at ?basename. – neilfws, Mar 7 at 22:38
• Or a basename/dirname combo: dirname(chartr(".", "/", basename(link))) – markus, Mar 7 at 22:44
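
For readers who, like the asker, have not met basename before: it returns the last component of a path, and dirname returns everything before it. A short sketch on the question's path; both are base R functions that work on the string alone, so the file does not need to exist:

```r
link <- "D:/temp/sample_data/0000098618-13-000011.htm"

# Last path component: the file name with its extension
file_name <- basename(link)

# Everything before the last path separator
folder <- dirname(link)

# The combo from the comment above: turn the dot into a slash so that
# dirname() treats the extension as the final path component and drops it
id <- dirname(chartr(".", "/", basename(link)))

print(file_name)  # "0000098618-13-000011.htm"
print(folder)     # "D:/temp/sample_data"
print(id)         # "0000098618-13-000011"
```

The chartr trick replaces every dot, so it is only a fit when the file name has a single dot.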

















Using stringr, extract the first run of digits and hyphens:

library(stringr)
str_extract(link, "[0-9-]+")
# "0000098618-13-000011"
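
Matching on digits and hyphens relies on the directory part of the path containing neither. A hedged base R sketch of the same idea with regmatches and regexpr, plus a stricter pattern that assumes IDs always have the digits-digits-digits shape seen in the question; the second path is hypothetical:

```r
link <- "D:/temp/sample_data/0000098618-13-000011.htm"
tricky <- "D:/temp2/sample_data/0000098618-13-000011.htm"  # hypothetical: digit in a folder name

# Base R counterpart of the stringr call: first match of the pattern
first_run <- function(x) regmatches(x, regexpr("[0-9-]+", x))

# Stricter pattern (assumption): three digit groups joined by hyphens
id_pattern <- "[0-9]+-[0-9]+-[0-9]+"
extract_id <- function(x) regmatches(x, regexpr(id_pattern, x))

print(first_run(link))     # "0000098618-13-000011"
print(first_run(tricky))   # "2", the digit in the folder name wins
print(extract_id(tricky))  # "0000098618-13-000011"
```

Unlike str_extract, which returns NA for an element with no match, regmatches drops that element, so the result can be shorter than the input.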






answered Mar 8 at 14:23 by sindri_baldur


























