Identify and group across observations



How can I create a new variable that flags observations according to whether a given country appears in their group? Say I have the following dataset:



ID | country | side 
1 | arg | 1
1 | usa | 0
2 | ita | 1
2 | usa | 0
2 | uk | 1
3 | aus | 0
3 | uk | 1


I want to create a new variable (sideuk) that identifies, for each row, whether country "uk" was involved in that ID and on the same side as that row's country. For example, this would be:



ID | country | side | sideuk
1 | arg | 1 | 0
1 | usa | 0 | 0
2 | ita | 1 | 1
2 | usa | 0 | 0
2 | uk | 1 | 1
3 | aus | 0 | 0
3 | uk | 1 | 1









Tags: r, dplyr






asked Mar 6 at 22:04 by Agustín Indaco






















3 Answers






I'm not entirely sure what you're after, but the following reproduces your expected output:



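library(dplyr)

# Within each ID, flag the side-1 rows when "uk" appears in that ID
# (this relies on "uk" being on side 1, as it is in the sample data)
df %>%
  group_by(ID) %>%
  mutate(sideuk = +("uk" %in% country & side == 1)) %>%
  ungroup()
# A tibble: 7 x 4
#      ID country  side sideuk
#   <int> <fct>   <int>  <int>
# 1     1 arg         1      0
# 2     1 usa         0      0
# 3     2 ita         1      1
# 4     2 usa         0      0
# 5     2 uk          1      1
# 6     3 aus         0      0
# 7     3 uk          1      1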



Sample data

df <- read.table(text =
"ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1", header = TRUE)
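
The pipeline above gives the expected output because "uk" happens to be on side 1 in every ID where it appears. As a minimal sketch, assuming the intent is to flag rows that share both ID and side with "uk", the comparison can use whatever side "uk" actually takes within the ID. The data frame `df2`, its ID 4 rows, and the name `res` are made up for illustration and are not part of the question:

```r
library(dplyr)

# Hypothetical data: the question's rows plus an ID where "uk" is on side 0
df2 <- read.table(text = "ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1
4 mx 1
4 uk 0", header = TRUE)

res <- df2 %>%
  group_by(ID) %>%
  # 1 when this row's side equals a side that "uk" takes in the same ID;
  # in IDs without "uk" the right-hand side is empty, so every row gets 0
  mutate(sideuk = +(side %in% side[country == "uk"])) %>%
  ungroup()

res
```

For the question's seven rows this gives the same column as before. Only the hypothetical ID 4 rows differ: "uk" (side 0) is flagged and "mx" (side 1) is not.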






answered Mar 6 at 22:10 by Maurits Evers























You want to group by ID and side, and then check for 'uk' in the country variable:



library(dplyr)

# Within each (ID, side) group, flag every row if "uk" is in that group
df %>%
  group_by(ID, side) %>%
  mutate(sideuk = as.integer('uk' %in% country))

# A tibble: 7 x 4
# Groups:   ID, side [6]
#      ID country  side sideuk
#   <dbl> <fct>   <dbl>  <int>
# 1     1 arg         1      0
# 2     1 usa         0      0
# 3     2 ita         1      1
# 4     2 usa         0      0
# 5     2 uk          1      1
# 6     3 aus         0      0
# 7     3 uk          1      1
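
The result above is still grouped by ID and side (see the `# Groups:` line), which affects any later dplyr verbs applied to it. As a minimal sketch, assuming the sample data from the question, the grouping can be dropped with `ungroup()` and the new column compared with the expected output given in the question. The names `res` and `expected` are introduced here for illustration:

```r
library(dplyr)

# Sample data from the question
df <- read.table(text = "ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1", header = TRUE)

res <- df %>%
  group_by(ID, side) %>%
  mutate(sideuk = as.integer("uk" %in% country)) %>%
  ungroup()  # return an ordinary, ungrouped tibble

# Expected sideuk column, copied from the question
expected <- c(0L, 0L, 1L, 0L, 1L, 0L, 1L)

# Check that the computed column matches the expected one
identical(res$sideuk, expected)
```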






answered Mar 6 at 22:12 by divibisan





















I am not sure whether this is what you are looking for, but here is a solution that uses no external packages:



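# apply() passes each row as a character vector, so row["ID"] and
# row["side"] are strings here; the comparisons below rely on coercion
df$sideuk <- apply(df, 1, function(row) {
  # 1 if this row is on side 1 and "uk" is on side 1 within the same ID
  as.integer(any(df[df$ID == row["ID"] & df$country == "uk" & row["side"] == 1, "side"]))
})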


Returns:

  ID country side sideuk
1  1     arg    1      0
2  1     usa    0      0
3  2     ita    1      1
4  2     usa    0      0
5  2      uk    1      1
6  3     aus    0      0
7  3      uk    1      1
8  4      mx    1      0
9  4      uk    0      0



Sample data

df <- read.table(text =
"ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1
4 mx 1
4 uk 0", header = TRUE)
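
The row-wise `apply()` call above can also be written with vectorized comparisons. This is a sketch in base R, assuming the same rule as the code above: a row gets 1 when it is on side 1 and "uk" is on side 1 within the same ID. The helper name `uk_ids` is introduced here for illustration:

```r
# Sample data from this answer
df <- read.table(text = "ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1
4 mx 1
4 uk 0", header = TRUE)

# IDs in which "uk" appears on side 1
uk_ids <- df$ID[df$country == "uk" & df$side == 1]

# 1 for side-1 rows whose ID is among those IDs, 0 otherwise
df$sideuk <- as.integer(df$side == 1 & df$ID %in% uk_ids)

df
```

Because no row is converted to a character vector, this version does not depend on how `apply()` formats the ID and side columns.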






answered Mar 6 at 22:45 by mayrop


























