How to display the particular max row in pyspark dataframes

I have the following code

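# Spark column functions; note these shadow Python's built-in round and sum
from pyspark.sql.functions import round, sum

(ageDF.sort('Period')
 .groupBy('Period')
 .agg(round(sum('Age_specific_birth_rate'), 2).alias('Total Births'))
 .show())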

This sums Age_specific_birth_rate for each Period.

The output looks like this:

+------+------------+
|Period|Total Births|
+------+------------+
|  2000|       395.5|
|  2001|       393.4|
|  2002|       377.3|
|  2003|       386.2|
|  2004|       395.9|
|  2005|       391.9|
|  2006|       400.4|
|  2007|       434.0|
|  2008|       437.8|
|  2009|       425.7|
|  2010|       434.0|
|  2011|       417.8|
|  2012|       418.2|
|  2013|       400.4|
|  2014|       384.3|
|  2015|       398.7|
|  2016|       374.8|
|  2017|       362.7|
|  2018|       342.2|
+------+------------+

But I want to display the row with the maximum total, along with its Period.

So when I run the following code

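# Spark column functions; note these shadow Python's built-in round, sum and max
from pyspark.sql.functions import round, sum, max

(ageDF.sort('Period')
 .groupBy('Period')
 .agg(round(sum('Age_specific_birth_rate'), 2).alias('Total'))
 .select('Period', 'Total')
 .agg(max('Total'))  # aggregates over the whole DataFrame, so Period is dropped
 .show())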

I get the output

+----------+
|max(Total)|
+----------+
|     437.8|
+----------+

But I want to get something like this:

+------+----------+
|Period|max(Total)|
+------+----------+
|  2008|     437.8|
+------+----------+

What should I do?

Thank you

edited Mar 8 at 6:31 by howie

asked Mar 8 at 6:13 by Rudy

That's a common problem. You want to output the max value and the row that contains it. An alternative is to loop over your data and compare each value with the max; if they are equal, output that row. There is probably more than one answer.
– MoreFreeze, Mar 8 at 6:23

Can you add a small initial dataset as an example, together with the expected output for that dataset, so that the case can be reproduced and understood?
– Daniel Sobrado, Mar 8 at 6:25

Possible duplicate of GroupBy column and filter rows with maximum value in Pyspark
– pault, Mar 8 at 14:51

Tags: python, pyspark, apache-spark-sql

1 Answer

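You can try sorting by the total in descending order and keeping only the first row:

from pyspark.sql import functions
from pyspark.sql.functions import round, sum  # shadow Python's built-ins

(ageDF.sort('Period')
 .groupBy('Period')
 .agg(round(sum('Age_specific_birth_rate'), 2).alias('Total'))
 .orderBy(functions.col('Total').desc())  # largest total first
 .limit(1)                                # keep only that row
 .select('Period', 'Total')
 .show())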

answered Mar 8 at 6:27 by howie
