How to detect specific subwords in text

I have a string variable whose values contain no spaces:



clear
input str100 var
"ihaveanewspaper"
"watchingthenewsonthetv"
"watchthenewsandreadthenewspaper"
end


I am using the following command:



gen = regex,(var, "(news)")


This returns 1 for all three rows, because every value of var contains the word news.

I'm trying to alter the regular expression "(news)" to create two columns: one for news and one for newspaper. regexm(var, "(newspaper)") identifies the rows that contain newspaper, but I also need a way to make sure that the characters following news are not "paper", as I want to count the two separately.




EDIT:



Is there a way to count the third entry as 1, since it contains an occurrence of news that is not part of newspaper?










Tags: regex, stata

asked Mar 7 at 10:18 by sammyramz
edited Mar 7 at 20:24 by Pearly Spencer







  • The "following command" is illegal. Few users of Stata will not realise that, but it's not good practice to give illegal commands as examples.
    – Nick Cox, Mar 7 at 13:27












1 Answer
You can count the two cases as follows, without a regular expression:



clear
input str100 var
"ihaveanewspaper"
"watchingthenewsonthetv"
"watchthenewsandreadthenewspaper"
"fdgdnews"
"fgogodigjhoigjnewspaper"
"fgeogeionnewsfgdgfpaper"
"45pap9358newsfjfgni"
end

* 1 if var contains "news" and "newspaper" appears nowhere in the string
generate news = strmatch(var, "*news*") & !strmatch(var, "*newspaper*")

list, separator(0)

     +----------------------------------------+
     |                             var   news |
     |----------------------------------------|
  1. |                 ihaveanewspaper      0 |
  2. |          watchingthenewsonthetv      1 |
  3. | watchthenewsandreadthenewspaper      0 |
  4. |                        fdgdnews      1 |
  5. |         fgogodigjhoigjnewspaper      0 |
  6. |         fgeogeionnewsfgdgfpaper      1 |
  7. |             45pap9358newsfjfgni      1 |
     +----------------------------------------+

count if news
4

count if !news
3
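
The question also asks for a separate column for newspaper. A minimal sketch of that, assuming the same var as above; the variable names has_newspaper and has_news are illustrative and not part of the original answer, and has_news simply mirrors the rule used for news so far:

```stata
* Illustrative names only (not from the original answer).
* strpos() returns the position of the first match, or 0 if there is none.
generate has_newspaper = (strpos(var, "newspaper") > 0)

* Same rule as news above: "news" is present and "newspaper" is absent.
generate has_news = (strpos(var, "news") > 0) & !has_newspaper

* Cross-tabulate the two indicators to see how the rows split.
tabulate has_news has_newspaper
```

With this rule a row that contains both words is still counted under newspaper only.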



EDIT:



One way to do this is to remove every instance of the word newspaper and then flag any row that still contains news:



* strip every "newspaper" from var, then flag rows that still contain "news"
generate var2 = subinstr(var, "newspaper", "", .)
replace news = 1 if strmatch(var2, "*news*")

list, separator(0)

     +------------------------------------------------------------------+
     |                             var   news                      var2 |
     |------------------------------------------------------------------|
  1. |                 ihaveanewspaper      0                    ihavea |
  2. |          watchingthenewsonthetv      1    watchingthenewsonthetv |
  3. | watchthenewsandreadthenewspaper      1    watchthenewsandreadthe |
  4. |                        fdgdnews      1                  fdgdnews |
  5. |         fgogodigjhoigjnewspaper      0            fgogodigjhoigj |
  6. |         fgeogeionnewsfgdgfpaper      1   fgeogeionnewsfgdgfpaper |
  7. |             45pap9358newsfjfgni      1       45pap9358newsfjfgni |
     +------------------------------------------------------------------+

count if news
5

count if !news
2
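
If a regular expression is still preferred, a negative lookahead expresses "news not followed by paper" directly. The following is only a sketch: it assumes Stata 14 or later, where ustrregexm() is available and uses the ICU regex engine (the older regexm() may not accept lookahead, depending on the version). The names news_re, n_newspaper and n_news_only are illustrative. The second part counts occurrences rather than just flagging them, using the change in string length after deleting the word.

```stata
* Assumes Stata 14+; all new variable names here are illustrative.
* 1 if at least one "news" is not immediately followed by "paper".
generate news_re = ustrregexm(var, "news(?!paper)")

* Number of times "newspaper" occurs: characters removed / word length.
generate n_newspaper = (strlen(var) - strlen(subinstr(var, "newspaper", "", .))) / strlen("newspaper")

* Occurrences of "news" that are not part of "newspaper".
generate n_news_only = (strlen(var) - strlen(subinstr(var, "news", "", .))) / strlen("news") - n_newspaper

list var news_re n_news_only n_newspaper, separator(0)
```

On the example data, news_re should agree with news after the replace step above, and n_news_only is positive in exactly those rows.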





answered Mar 7 at 10:50 by Pearly Spencer
edited Mar 7 at 20:22




























