Subset based on most frequent value

Say I have the dataset below as a CSV file.

I want my output to be a CSV file containing only the rows whose value in column B is the most frequent one.

In the example data below, the most frequent value in column B is "1", but this will change, so my code must not hard-code that value.

A B
! 1
@ 1
# 1
$ 1
% 2
^ 3
& 2
* 4
( 5
) 2

In this example, I want my output to be a CSV file of:

A B
! 1
@ 1
# 1 
$ 1

But since the most frequent value will change, I'm not sure what my code should be.

Any help you can provide will be much appreciated. Thank you.

asked Mar 8 at 21:44 by Amie Johnson

edited Mar 8 at 22:00 by petezurich

Welcome to Stack Overflow. Just to clarify: if I were to summarise what you are trying to achieve, you would 1) find the most frequent value in B; 2) discard all rows where B is not the most frequent value?

– 4Oh4
Mar 8 at 22:46

python pandas pandas-groupby

1 Answer

We can use mode to find the value that appears most often and then filter on it. mode() returns a Series, because several values can tie for most frequent, so [0] picks the first of them:

df[df['B']==df['B'].mode()[0]]

Output:

 A B
0 ! 1
1 @ 1
2 # 1
3 $ 1
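The question asks for CSV in and CSV out, which the one-liner above leaves implicit. Here is a minimal sketch of the full round trip. It assumes a comma-delimited file with a header row (the post does not state the delimiter) and uses in-memory buffers in place of real file paths:

```python
import io

import pandas as pd

# Stand-in for the question's CSV file. The comma delimiter is an
# assumption; the post only shows the data with spaces between columns.
csv_text = """A,B
!,1
@,1
#,1
$,1
%,2
^,3
&,2
*,4
(,5
),2
"""

# With a real file this would be pd.read_csv("some_input.csv").
df = pd.read_csv(io.StringIO(csv_text))

# mode() returns a Series of all tied modes; take the first one.
most_frequent = df["B"].mode()[0]
subset = df[df["B"] == most_frequent]

# index=False keeps the row labels (0, 1, 2, ...) out of the output.
# With a real file this would be subset.to_csv("some_output.csv", index=False).
out = io.StringIO()
subset.to_csv(out, index=False)
print(out.getvalue())
```

If keeping every tied value matters, `df[df["B"].isin(df["B"].mode())]` would retain the rows for all modes rather than only the first.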

value_counts sorts the distinct values by descending frequency, so its index gives the Nth most frequent value (N=0 being the most frequent):

df[df['B']==df['B'].value_counts().index[N]]

e.g. for N=1:

df[df['B']==df['B'].value_counts().index[1]]

Output:

 A B
4 % 2
6 & 2
9 ) 2
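The Nth-most-frequent filter can be wrapped in a small function. This is only a sketch: the helper name `rows_with_nth_most_frequent` and the bounds check are illustrative additions, not part of the original answer. Note that when several values share the same count, as 3, 4 and 5 do here, the order among them in value_counts should not be relied on.

```python
import pandas as pd

# The example data from the question, built in memory.
df = pd.DataFrame({
    "A": list("!@#$%^&*()"),
    "B": [1, 1, 1, 1, 2, 3, 2, 4, 5, 2],
})


def rows_with_nth_most_frequent(frame, column, n=0):
    """Hypothetical helper: keep rows whose `column` value is the
    n-th most frequent one (n=0 is the most frequent)."""
    # value_counts() sorts distinct values by descending count.
    counts = frame[column].value_counts()
    if not 0 <= n < len(counts):
        raise IndexError(
            f"n={n} is out of range: only {len(counts)} distinct values"
        )
    return frame[frame[column] == counts.index[n]]


print(rows_with_nth_most_frequent(df, "B"))       # rows where B == 1
print(rows_with_nth_most_frequent(df, "B", n=1))  # rows where B == 2
```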

answered Mar 8 at 21:48 by perl

edited Mar 8 at 22:06

Thank you so much!! You are the best. I'm a new python user and am still learning.

– Amie Johnson
Mar 8 at 21:51

What if I wanted the 2nd or 3rd most frequent values?

– Amie Johnson
Mar 8 at 21:52

Updated my answer with how to filter on the Nth most frequent

– perl
Mar 8 at 21:59

Amazing, thank you again so much! Have a great weekend!

– Amie Johnson
Mar 8 at 22:07

@AmieJohnson (and perl, you should both get a notification): Welcome to the both of you to Stack Overflow! Please try to use comments as intended, see when not to comment: while "Thanks!" is polite, it's noise for any future readers. If you like this answer, please upvote it (and accept it if it answers your question) and pay it forward by helping others if you can!

– TemporalWolf
Mar 8 at 22:23
