How to get cumulative sum of unique IDs with group by?
I am very new to Python and pandas, and I am working with a pandas DataFrame that looks like this:
Date   Time   ID  Weight
Jul-1  12:00  A   10
Jul-1  12:00  B   20
Jul-1  12:00  C   100
Jul-1  12:10  C   100
Jul-1  12:10  D   30
Jul-1  12:20  C   100
Jul-1  12:20  D   30
Jul-1  12:30  A   10
Jul-1  12:40  E   40
Jul-1  12:50  F   50
Jul-1  1:00   A   40
I am trying to group by Date and Time and apply a cumulative sum over the IDs, such that if an ID has already appeared in an earlier time slot, its weight is added only once. The resulting DataFrame would look like this:
Date   Time   Weight
Jul-1  12:00  130 (10+20+100)
Jul-1  12:10  160 (10+20+100+30)
Jul-1  12:20  160 (10+20+100+30)
Jul-1  12:30  160 (10+20+100+30)
Jul-1  12:40  200 (10+20+100+30+40)
Jul-1  12:50  250 (10+20+100+30+40+50)
Jul-1  01:00  250 (10+20+100+30+40+50)
This is what I tried, but it still counts the weights multiple times:
df=df.groupby(['date','time','ID'])['Wt'].apply(lambda x: x.unique().sum()).reset_index()
df['cumWt']=df['Wt'].cumsum()
Any help would be really appreciated!
Thanks a lot in advance!
python pandas data-processing
check groupby and agg
– Wen-Ben
Mar 7 at 19:59
edited Mar 7 at 20:14 by petezurich
asked Mar 7 at 19:45 by AnalyticsTeam
1 Answer
The code below uses DataFrame.duplicated(), DataFrame.merge(), groupby/sum and cumsum() to reach the desired output:
# keep each ID's weight only on the first row where that ID appears, and rename the series for the merge
unique_weights = df['weight'][~df.duplicated(['ID'])].rename('consider_cum')
# merge the series back into the original dataframe and replace the ignored values with 0
df = df.merge(unique_weights.to_frame(), how='left', left_index=True, right_index=True)
df['consider_cum'] = df['consider_cum'].fillna(0)
# sum per date and time (sort=False keeps the time slots in their original order)
df = df.groupby(['date', 'time'], sort=False)['consider_cum'].sum().reset_index()
# create the cumulative sum column and present the output
df['weight_cumsum'] = df['consider_cum'].cumsum()
df[['date', 'time', 'weight_cumsum']]
For the sample data this produces the cumulative weights requested in the question: 130, 160, 160, 160, 200, 250, 250.
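For a self-contained check, the minimal sketch below rebuilds the sample data from the question and applies the same idea: count each ID's weight only at its first appearance, sum per time slot, then accumulate. The lowercase column names, the use of `Series.where` in place of the merge, and `sort=False` are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data from the question; the lowercase column names are an assumption.
df = pd.DataFrame({
    "date": ["Jul-1"] * 11,
    "time": ["12:00", "12:00", "12:00", "12:10", "12:10", "12:20",
             "12:20", "12:30", "12:40", "12:50", "1:00"],
    "ID": ["A", "B", "C", "C", "D", "C", "D", "A", "E", "F", "A"],
    "weight": [10, 20, 100, 100, 30, 100, 30, 10, 40, 50, 40],
})

# Keep the weight only on the first row where each ID appears;
# every later row for that ID contributes 0.
first_seen = ~df.duplicated(subset="ID")
df["consider_cum"] = df["weight"].where(first_seen, 0)

# Sum per time slot (sort=False keeps the slots in their original order),
# then accumulate across slots.
result = (
    df.groupby(["date", "time"], sort=False)["consider_cum"]
      .sum()
      .cumsum()
      .reset_index(name="weight_cumsum")
)
print(result)
```

Deduplicating on `ID` rather than on the weight value matters when two different IDs happen to share the same weight: each of them should still be counted once.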
Thanks a lot @Daniel!
– AnalyticsTeam
Mar 7 at 23:37
You are welcome, @AnalyticsTeam :)
– Daniel Labbe
Mar 8 at 8:47
edited Mar 7 at 21:01
answered Mar 7 at 20:55 by Daniel Labbe