Identify and group across observations
How can I generate a new variable that identifies which observations belong to different groups? Say I have the following dataset:
ID | country | side
1 | arg | 1
1 | usa | 0
2 | ita | 1
2 | usa | 0
2 | uk | 1
3 | aus | 0
3 | uk | 1
and I want to create a new variable (sideuk) that identifies whether country "uk" was involved in the same ID and side as each country. For example, this would be:
ID | country | side | sideuk
1 | arg | 1 | 0
1 | usa | 0 | 0
2 | ita | 1 | 1
2 | usa | 0 | 0
2 | uk | 1 | 1
3 | aus | 0 | 0
3 | uk | 1 | 1
r dplyr
asked Mar 6 at 22:04
Agustín Indaco
3 Answers
I'm not entirely sure what you're after, but the following reproduces your expected output:
library(dplyr)
df %>%
    group_by(ID) %>%
    mutate(sideuk = +("uk" %in% country & side == 1)) %>%
    ungroup()
## A tibble: 7 x 4
#     ID country  side sideuk
#  <int> <fct>   <int>  <int>
#1     1 arg         1      0
#2     1 usa         0      0
#3     2 ita         1      1
#4     2 usa         0      0
#5     2 uk          1      1
#6     3 aus         0      0
#7     3 uk          1      1
Sample data
df <- read.table(text =
"ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1", header = TRUE)
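As a side note on how the mutate() expression in this answer works: within one ID group, "uk" %in% country yields a single TRUE/FALSE, which R recycles against the row-wise side == 1, and the unary + converts the logical result to integers. A minimal base R sketch on the rows of ID 2 (the vectors are copied by hand for illustration and the variable names are mine):

```r
# Rows of ID 2 from the sample data, written out by hand
country <- c("ita", "usa", "uk")
side    <- c(1L, 0L, 1L)

in_group <- "uk" %in% country     # one logical value for the whole group
flag     <- in_group & side == 1  # recycled against each row of the group
sideuk   <- +flag                 # unary plus converts logical to integer

sideuk
```

Inside group_by(ID), dplyr evaluates the same expression once per ID, so each group gets its own value of "uk" %in% country.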
You want to group by ID and side, and then check for 'uk' in the country variable:
library(dplyr)
df %>%
    group_by(ID, side) %>%
    mutate(sideuk = as.integer('uk' %in% country))
# A tibble: 7 x 4
# Groups:   ID, side [6]
     ID country  side sideuk
  <dbl> <fct>   <dbl>  <int>
1     1 arg         1      0
2     1 usa         0      0
3     2 ita         1      1
4     2 usa         0      0
5     2 uk          1      1
6     3 aus         0      0
7     3 uk          1      1
I am not sure if this is what you are looking for. It is a solution without external libraries:
# For each row: 1 if the row is on side 1 and "uk" is on side 1 in the same ID, else 0
df$sideuk <- apply(df, 1, function(row)
    return(
        as.integer(any(df[df$ID == row["ID"] & df$country == "uk" & row["side"] == 1, "side"]))
    )
)
Returns:
  ID country side sideuk
1  1     arg    1      0
2  1     usa    0      0
3  2     ita    1      1
4  2     usa    0      0
5  2      uk    1      1
6  3     aus    0      0
7  3      uk    1      1
8  4      mx    1      0
9  4      uk    0      0
Sample data
df <- read.table(text =
"ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1
4 mx 1
4 uk 0", header = TRUE)
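The three answers agree on the question's seven rows but, as far as I can tell, encode three different rules, which only shows once "uk" sits on side 0 (ID 4 in the sample data above). The sketch below restates each rule in vectorised base R using ave(). The rule_* names are mine, and the rules are my reading of each answer rather than anything the answerers stated. It also avoids apply(), which converts the data frame to a character matrix before the row function runs.

```r
df <- read.table(text =
"ID country side
1 arg 1
1 usa 0
2 ita 1
2 usa 0
2 uk 1
3 aus 0
3 uk 1
4 mx 1
4 uk 0", header = TRUE, stringsAsFactors = FALSE)

is_uk <- df$country == "uk"

# Rule A (group by ID): "uk" appears in the ID and this row is on side 1
rule_id <- as.integer(ave(is_uk, df$ID, FUN = any) & df$side == 1)

# Rule B (group by ID and side): "uk" appears in the same ID on the same side
rule_id_side <- as.integer(ave(is_uk, df$ID, df$side, FUN = any))

# Rule C (apply-based answer): this row and "uk" are both on side 1 in the same ID
rule_both_1 <- as.integer(df$side == 1 & ave(is_uk & df$side == 1, df$ID, FUN = any))

# Side-by-side comparison of the three rules
cbind(df, rule_id, rule_id_side, rule_both_1)
```

On IDs 1 to 3 the three columns are identical. On ID 4 they differ (mx is flagged only by rule A, uk only by rule B), so which one is right depends on what sideuk is meant to capture.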
answered Mar 6 at 22:10 by Maurits Evers (first answer, grouping by ID)
answered Mar 6 at 22:12 by divibisan (second answer, grouping by ID and side)
answered Mar 6 at 22:45 by mayrop (third answer, without external libraries)