


Subset based on most frequent value


























Say I have the below dataset as a CSV file.



I want my output to be a CSV file containing only the rows associated with the most frequent value in column B.

In the example data below, the most frequent value in column B is "1". However, this will change, so I need my code not to hard-code that value.



A B
! 1
@ 1
# 1
$ 1
% 2
^ 3
& 2
* 4
( 5
) 2

In this example, I want my output to be a CSV file of:

A B
! 1
@ 1
# 1
$ 1


But since the most frequent value will change, I'm not sure what my code should be.



Any help you can provide will be much appreciated. Thank you.


































  • Welcome to Stack Overflow. Just to clarify: to summarise what you are trying to achieve, you would 1) find the most frequent value in B; 2) discard all rows where B is not the most frequent value?

    – 4Oh4
    Mar 8 at 22:46
























python pandas pandas-groupby














asked Mar 8 at 21:44 by Amie Johnson
edited Mar 8 at 22:00 by petezurich
























1 Answer














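We can use mode to find the value that appears most often and then filter on that value. mode returns a Series, because several values can tie for most frequent, so [0] selects the first of them: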



df[df['B']==df['B'].mode()[0]]


Output:



 A B
0 ! 1
1 @ 1
2 # 1
3 $ 1
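For completeness, here is a minimal end-to-end sketch of the same idea, going from CSV to CSV. It assumes a comma-separated file with a header row A,B (the question does not show the actual delimiter), and it uses an in-memory string as a stand-in for the file so the snippet runs on its own. The variable names are only illustrative.

```python
import io

import pandas as pd

# Stand-in for the CSV file from the question (assumed comma-separated).
csv_text = """A,B
!,1
@,1
#,1
$,1
%,2
^,3
&,2
*,4
(,5
),2
"""

df = pd.read_csv(io.StringIO(csv_text))

# mode() returns a Series (there can be ties); [0] takes the first entry.
most_frequent = df["B"].mode()[0]

# Keep only the rows whose B equals that value.
subset = df[df["B"] == most_frequent]

# With no path given, to_csv returns the CSV text; index=False drops row labels.
out_csv = subset.to_csv(index=False)
print(out_csv)
```

With real files, the read would be something like pd.read_csv("input.csv") and the write subset.to_csv("output.csv", index=False), where both file names are hypothetical. If several values tie for most frequent and you want the rows for all of them, df[df["B"].isin(df["B"].mode())] should do that instead of picking only the first.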


value_counts sorts the distinct values by descending count, so indexing its result gives the Nth most frequent value (N=0 is the most frequent):



df[df['B']==df['B'].value_counts().index[N]]


e.g. for N=1:



df[df['B']==df['B'].value_counts().index[1]]


Output:



 A B
4 % 2
6 & 2
9 ) 2
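If you need this for several values of N, the value_counts approach could be wrapped in a small helper. This is only a sketch: the function name subset_nth_most_frequent and the bounds check are my own additions, not part of pandas, and the example frame is rebuilt by hand from the question's data.

```python
import pandas as pd


def subset_nth_most_frequent(df, column, n=0):
    """Return the rows whose `column` holds the n-th most frequent value.

    n=0 is the most frequent value, n=1 the second most frequent, and so on.
    """
    counts = df[column].value_counts()  # distinct values, sorted by descending count
    if not 0 <= n < len(counts):
        raise IndexError(f"n={n} is out of range for {len(counts)} distinct values")
    return df[df[column] == counts.index[n]]


# The example data from the question.
df = pd.DataFrame({
    "A": list("!@#$%^&*()"),
    "B": [1, 1, 1, 1, 2, 3, 2, 4, 5, 2],
})

second = subset_nth_most_frequent(df, "B", n=1)
print(second)
```

Note that when several values share the same count (3, 4 and 5 in this data), their relative order in value_counts may depend on the pandas version, so asking for the "3rd most frequent" among tied values is ambiguous.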

























  • Thank you so much!! You are the best. I'm a new python user and am still learning.

    – Amie Johnson
    Mar 8 at 21:51











  • What if I wanted the 2nd or 3rd most frequent values?

    – Amie Johnson
    Mar 8 at 21:52











  • Updated my answer with how to filter on the Nth most frequent

    – perl
    Mar 8 at 21:59












  • Amazing, thank you again so much! Have a great weekend!

    – Amie Johnson
    Mar 8 at 22:07






  • @AmieJohnson (and perl, you should both get a notification): Welcome to the both of you to Stack Overflow! Please try to use comments as intended, see when not to comment: while "Thanks!" is polite, it's noise for any future readers. If you like this answer, please upvote it (and accept it if it answers your question) and pay it forward by helping others if you can!

    – TemporalWolf
    Mar 8 at 22:23




















answered Mar 8 at 21:48 by perl
edited Mar 8 at 22:06






