Set column name for apply result over groupby
This is a fairly trivial problem, but it's triggering my OCD and I haven't been able to find a suitable solution for the past half hour.
For background, I'm looking to calculate a value (let's call it F) for each group in a DataFrame derived from different aggregated measures of columns in the existing DataFrame.
Here's a toy example of what I'm trying to do:
    import pandas as pd
    import numpy as np

    # The dict literal needs braces; the scraped copy had lost them.
    df = pd.DataFrame({'A': ['X', 'Y', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'Y', 'X'],
                       'B': ['N', 'N', 'N', 'M', 'N', 'M', 'M', 'N', 'M', 'N'],
                       'C': [69, 83, 28, 25, 11, 31, 14, 37, 14, 0],
                       'D': [0.3, 0.1, 0.1, 0.8, 0.8, 0.0, 0.8, 0.8, 0.1, 0.8],
                       'E': [11, 11, 12, 11, 11, 12, 12, 11, 12, 12]
                       })

    df_grp = df.groupby(['A', 'B'])
    # One scalar per (A, B) group; the result is an unnamed Series.
    df_grp.apply(lambda x: x['C'].sum() * x['D'].mean() / x['E'].max())
What I'd like to do is assign a name to the result of `apply` (or `lambda`). Is there any way to do this without moving the `lambda` to a named function or renaming the column after running the last line?
python pandas
edited Mar 9 at 4:29 by JJJ
asked Apr 22 '15 at 15:21 by MrT
What is your expected output for the toy data?
– Zero
Apr 22 '15 at 15:24
`5.583333, 2.975000, 3.845455`, which is what the function returns.
– MrT
Apr 22 '15 at 15:28
Like stackoverflow.com/a/29778475/2137255 ?
– Zero
Apr 22 '15 at 15:33
Essentially. Is there a way of assigning a name to the result short of defining the function? I'd prefer to use `lambda`.
– MrT
Apr 22 '15 at 15:41
Actually, looking at that link again, it's not exactly what I want. I need the result at the group level only, not the original DataFrame.
– MrT
Apr 22 '15 at 15:52
2 Answers
Have the lambda function return a new Series:

    df_grp.apply(lambda x: pd.Series({'new_name':
                        x['C'].sum() * x['D'].mean() / x['E'].max()}))

         new_name
    A B
    X N  5.583333
    Y M  2.975000
      N  3.845455
edited May 11 '18 at 7:58 by smci
answered Apr 22 '15 at 16:22 by Alexander
Upvoted. This is a great general solution.
– MrT
Apr 22 '15 at 16:40
Scalable for multiple series too, thanks
– User632716
May 15 '18 at 10:56
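The remark about scaling to multiple series can be sketched as below. This is only an illustration under stated assumptions: the column names `F` and `C_total` are made up for the example, and selecting `[['C', 'D', 'E']]` before `apply` is an addition not found in the answer, made so that the grouping columns are not passed to the lambda (some recent pandas versions warn about that).

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', 'Y', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'Y', 'X'],
    'B': ['N', 'N', 'N', 'M', 'N', 'M', 'M', 'N', 'M', 'N'],
    'C': [69, 83, 28, 25, 11, 31, 14, 37, 14, 0],
    'D': [0.3, 0.1, 0.1, 0.8, 0.8, 0.0, 0.8, 0.8, 0.1, 0.8],
    'E': [11, 11, 12, 11, 11, 12, 12, 11, 12, 12],
})

# Each key of the returned Series becomes a named column in the result.
# 'F' and 'C_total' are hypothetical names chosen for this sketch.
summary = df.groupby(['A', 'B'])[['C', 'D', 'E']].apply(
    lambda x: pd.Series({
        'F': x['C'].sum() * x['D'].mean() / x['E'].max(),
        'C_total': x['C'].sum(),
    })
)

print(summary)
```

The result is a DataFrame indexed by the group keys `(A, B)`, so it stays at the group level rather than being broadcast back to the original rows.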
You could convert your Series to a DataFrame using `reset_index()` and provide `name='your_col_name'`, the name of the column corresponding to the Series values:

    (df_grp.apply(lambda x: x['C'].sum() * x['D'].mean() / x['E'].max())
           .reset_index(name='your_col_name'))

       A  B  your_col_name
    0  X  N       5.583333
    1  Y  M       2.975000
    2  Y  N       3.845455
answered Apr 22 '15 at 16:04 by Zero
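For comparison, here is a minimal sketch of chained variants that keep the lambda inline. It is not taken from either answer: the name `F` is illustrative, and the `[['C', 'D', 'E']]` selection is an added assumption so the grouping columns are not handed to the lambda. `Series.rename` and `Series.to_frame` keep the `(A, B)` group index, while `reset_index(name=...)` flattens it into ordinary columns as in the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', 'Y', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'Y', 'X'],
    'B': ['N', 'N', 'N', 'M', 'N', 'M', 'M', 'N', 'M', 'N'],
    'C': [69, 83, 28, 25, 11, 31, 14, 37, 14, 0],
    'D': [0.3, 0.1, 0.1, 0.8, 0.8, 0.0, 0.8, 0.8, 0.1, 0.8],
    'E': [11, 11, 12, 11, 11, 12, 12, 11, 12, 12],
})

# The lambda returns one scalar per group, so apply yields an unnamed Series.
result = df.groupby(['A', 'B'])[['C', 'D', 'E']].apply(
    lambda x: x['C'].sum() * x['D'].mean() / x['E'].max()
)

named = result.rename('F')            # still a Series, now named 'F'
frame = result.to_frame('F')          # one-column DataFrame, (A, B) index kept
flat = result.reset_index(name='F')   # A and B become regular columns

print(frame)
```

Which form fits best depends on whether the group keys are wanted as an index or as columns downstream.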