Subset based on most frequent value
Say I have the dataset below as a CSV file.
I want my output to be a CSV file containing only the rows associated with the most frequent value in column B.
In the example data below, the most frequent value in column B is "1", but this will change, so my code must not hard-code that value.
A B
! 1
@ 1
# 1
$ 1
% 2
^ 3
& 2
* 4
( 5
) 2
In this example, I want my output to be a CSV file of:
A B
! 1
@ 1
# 1
$ 1
But since the most frequent value will change, I'm not sure what my code should be.
Any help you can provide will be much appreciated. Thank you.
python pandas pandas-groupby
Welcome to Stack Overflow. Just to clarify: if I were to summarise what you are trying to achieve, you would 1) find the most frequent value in B, and 2) discard all rows where B is not the most frequent value?
– 4Oh4
Mar 8 at 22:46
edited Mar 8 at 22:00 by petezurich
asked Mar 8 at 21:44 by Amie Johnson
1 Answer
We can use mode to find the value that appears most often and then filter on that value (mode returns a Series, so [0] takes its first entry):
df[df['B'] == df['B'].mode()[0]]
Output:
A B
0 ! 1
1 @ 1
2 # 1
3 $ 1
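Because the question asks for a CSV file in and a CSV file out, the filter above can be wrapped in a read and write step. The following is a minimal sketch, assuming the file is comma-separated with a header row. The sample data is held in a string so the example runs on its own, and the names `raw`, `subset`, and `csv_out` are illustrative rather than taken from the original post.

```python
import io

import pandas as pd

# Sample data from the question, assumed here to be comma-separated.
# With a real file, pass its path to pd.read_csv instead of a StringIO.
raw = """A,B
!,1
@,1
#,1
$,1
%,2
^,3
&,2
*,4
(,5
),2
"""

df = pd.read_csv(io.StringIO(raw))

# mode() returns a Series of the most frequent value(s); take the first one.
most_frequent = df['B'].mode()[0]

# Keep only the rows whose B equals that value.
subset = df[df['B'] == most_frequent]

# With no path given, to_csv returns the CSV text; index=False drops the row labels.
csv_out = subset.to_csv(index=False)
print(csv_out)
```

With real files, a path can be passed to both `pd.read_csv` and `to_csv`, for example `subset.to_csv("output.csv", index=False)`, where the file name is hypothetical. If several values tie for most frequent, `mode()` returns all of them, so `[0]` keeps only the first; `df[df['B'].isin(df['B'].mode())]` would keep the rows for every tied value instead.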
value_counts can be used for the Nth most frequent value (N=0 being the most frequent), because it sorts the distinct values by descending count:
df[df['B'] == df['B'].value_counts().index[N]]
e.g. for N=1:
df[df['B'] == df['B'].value_counts().index[1]]
Output:
A B
4 % 2
6 & 2
9 ) 2
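The Nth-most-frequent filter can also be wrapped in a small helper that guards against an N larger than the number of distinct values. This is only a sketch under stated assumptions: the function name `rows_with_nth_most_frequent` and its parameters are hypothetical, not part of pandas or of the original answer.

```python
import pandas as pd


def rows_with_nth_most_frequent(df, column, n=0):
    """Return the rows whose `column` holds its n-th most frequent value (n=0 is the top)."""
    # value_counts() lists distinct values ordered by descending frequency.
    counts = df[column].value_counts()
    if not 0 <= n < len(counts):
        raise IndexError(
            f"n={n} is out of range: {column!r} has {len(counts)} distinct values"
        )
    return df[df[column] == counts.index[n]]


# The example data from the question.
df = pd.DataFrame({
    'A': list('!@#$%^&*()'),
    'B': [1, 1, 1, 1, 2, 3, 2, 4, 5, 2],
})

first = rows_with_nth_most_frequent(df, 'B')       # rows where B == 1
second = rows_with_nth_most_frequent(df, 'B', 1)   # rows where B == 2
print(second)
```

In this data the values 3, 4, and 5 each occur once, so which of them counts as "third most frequent" depends on how `value_counts` orders ties; it is safer not to rely on that order.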
Thank you so much!! You are the best. I'm a new python user and am still learning.
– Amie Johnson
Mar 8 at 21:51
What if I wanted the 2nd or 3rd most frequent values?
– Amie Johnson
Mar 8 at 21:52
Updated my answer with how to filter on the Nth most frequent
– perl
Mar 8 at 21:59
Amazing, thank you again so much! Have a great weekend!
– Amie Johnson
Mar 8 at 22:07
@AmieJohnson (and perl, you should both get a notification): Welcome to the both of you to Stack Overflow! Please try to use comments as intended, see when not to comment: while "Thanks!" is polite, it's noise for any future readers. If you like this answer, please upvote it (and accept it if it answers your question) and pay it forward by helping others if you can!
– TemporalWolf
Mar 8 at 22:23
edited Mar 8 at 22:06
answered Mar 8 at 21:48 by perl