How to detect specific subwords in text


I have a string variable whose values contain no spaces:



clear
input str100 var
"ihaveanewspaper"
"watchingthenewsonthetv"
"watchthenewsandreadthenewspaper"
end


I am using the following command:



gen = regex,(var, "(news)")


This returns 1 for all three observations, because each value of var contains the string news.
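
The command as typed above is not legal Stata syntax. A minimal sketch of what was presumably intended, assuming the function meant is regexm() and using has_news and has_newspaper as hypothetical variable names:

```stata
* assumption: the intended function is regexm(), which returns 1 on a match
* has_news and has_newspaper are hypothetical names, not from the original post
generate byte has_news      = regexm(var, "news")
generate byte has_newspaper = regexm(var, "newspaper")

list var has_news has_newspaper, separator(0)
```

With the three example rows, has_news is 1 for every row, while has_newspaper is 1 only for the first and third rows.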



I'm trying to alter the regular expression "(news)" to create two columns: one for news and one for newspaper. regexm(var, "(newspaper)") identifies the rows that contain newspaper, but I also need a way to make sure that the characters following news are not "paper", because I want to count the two separately.




EDIT:



Is there a way to count the third entry as 1, since it contains an occurrence of news that is not part of newspaper?










    The "following command" is illegal. Few users of Stata will not realise that, but it's not good practice to give illegal commands as examples.

    – Nick Cox
    Mar 7 at 13:27















regex stata






edited Mar 7 at 20:24
Pearly Spencer

asked Mar 7 at 10:18
sammyramz

1 Answer

You can flag the rows that contain news but not newspaper without a regular expression:



clear
input str100 var
"ihaveanewspaper"
"watchingthenewsonthetv"
"watchthenewsandreadthenewspaper"
"fdgdnews"
"fgogodigjhoigjnewspaper"
"fgeogeionnewsfgdgfpaper"
"45pap9358newsfjfgni"
end

* news is 1 when var contains "news" somewhere but "newspaper" nowhere
generate news = strmatch(var, "*news*") & !strmatch(var, "*newspaper*")

list, separator(0)

     +----------------------------------------+
     |                             var   news |
     |----------------------------------------|
  1. |                 ihaveanewspaper      0 |
  2. |          watchingthenewsonthetv      1 |
  3. | watchthenewsandreadthenewspaper      0 |
  4. |                        fdgdnews      1 |
  5. |         fgogodigjhoigjnewspaper      0 |
  6. |         fgeogeionnewsfgdgfpaper      1 |
  7. |             45pap9358newsfjfgni      1 |
     +----------------------------------------+

count if news
4

count if !news
3
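
The same rule can also be sketched with strpos(), which returns the position of the first occurrence of a substring, or 0 if there is none. The variable name news_alt is hypothetical and is only used here to check the result against news:

```stata
* strpos() gives 0 when the substring is absent
* news_alt is a hypothetical name used for comparison only
generate byte news_alt = (strpos(var, "news") > 0) & (strpos(var, "newspaper") == 0)

* stops with an error if the two indicators ever disagree
assert news_alt == news
```

This is only an alternative spelling of the strmatch() condition above; it does not change which rows are flagged.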



EDIT:



One way to also count rows like the third, where news appears on its own alongside newspaper, is to remove every instance of newspaper and then check what remains for news:



* strip every occurrence of "newspaper", then look for any "news" left over
generate var2 = subinstr(var, "newspaper", "", .)
replace news = 1 if strmatch(var2, "*news*")

list, separator(0)

     +------------------------------------------------------------------+
     |                             var   news                      var2 |
     |------------------------------------------------------------------|
  1. |                 ihaveanewspaper      0                    ihavea |
  2. |          watchingthenewsonthetv      1    watchingthenewsonthetv |
  3. | watchthenewsandreadthenewspaper      1    watchthenewsandreadthe |
  4. |                        fdgdnews      1                  fdgdnews |
  5. |         fgogodigjhoigjnewspaper      0            fgogodigjhoigj |
  6. |         fgeogeionnewsfgdgfpaper      1   fgeogeionnewsfgdgfpaper |
  7. |             45pap9358newsfjfgni      1       45pap9358newsfjfgni |
     +------------------------------------------------------------------+

count if news
5

count if !news
2
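
If a regular expression is preferred after all, a negative lookahead expresses "news not followed by paper" directly. The sketch below assumes Stata 14 or later, where ustrregexm() is available; the older regexm() is not assumed to support lookahead. It also shows one possible way to count occurrences per row by comparing string lengths before and after removal. The names news_only, n_newspaper and n_news are hypothetical:

```stata
* assumption: Stata 14+, so that ustrregexm() can be used
* news_only is 1 when some "news" is not immediately followed by "paper"
generate byte news_only = ustrregexm(var, "news(?!paper)")

* on the seven example rows this agrees with the news indicator after the EDIT
assert news_only == news

* occurrences per row: characters removed divided by the length of the word
generate n_newspaper = (strlen(var) - strlen(subinstr(var, "newspaper", "", .))) / strlen("newspaper")

* every "newspaper" also contains "news", so subtract those to get standalone "news"
generate n_news = (strlen(var) - strlen(subinstr(var, "news", "", .))) / strlen("news") - n_newspaper

list var news_only n_news n_newspaper, separator(0)
```

The lookahead and the subinstr() approach are not identical in general: removing newspaper can join the surrounding fragments into a new news (for example in "newnewspapers"), which the subinstr() approach would then count but the lookahead would not.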





edited Mar 7 at 20:22

answered Mar 7 at 10:50
Pearly Spencer
