Plot multiple time series events from pandas groupby object into single plot
I have a time-series related question on how to plot time stamps along a horizontal axis for multiple cases attributed to the same person. Let me explain:
Let us assume we have Jason and Georgia. Both of them work on different cases, each of which can have these "events": start, pause, resume, end. Many cases have just a "start" and an "end", whereas others also include a pause-resume interval. While one case is paused, the user can work on a different case. I have all this information in a pandas DataFrame, and I collect the user-and-case-level information with a groupby.
Sample data for reproducible code (assuming pandas and numpy are imported):
    import numpy as np
    import pandas as pd

    # One row per case; np.nan marks cases without a pause-resume interval.
    raw_data = {
        'user': ['Jason', 'Georgia', 'Jason', 'Jason', 'Georgia'],
        'case': ['a', 'b', 'c', 'd', 'e'],
        'date_picked_up': ['2018-10-25 14:06', '2019-01-25 10:44', '2019-01-25 09:14', '2019-01-25 12:12', '2019-02-21 10:01'],
        'date_paused': ['2018-10-26 11:08', '2019-01-25 12:11', np.nan, np.nan, '2019-02-21 12:37'],
        'date_resumed': ['2018-10-26 11:20', '2019-01-25 15:21', np.nan, np.nan, '2019-02-21 13:24'],
        'date_closed': ['2018-10-29 16:57', '2019-01-25 16:34', '2019-01-25 11:46', '2019-01-25 15:24', '2019-01-25 13:56'],
    }
    df = pd.DataFrame(raw_data, columns=['user', 'case', 'date_picked_up', 'date_paused', 'date_resumed', 'date_closed'])
    df
This returns df, a pandas DataFrame with the progression of each case. When a case has no pause-resume interval, the values are np.nan. By default, pandas groupby drops rows whose grouping keys are NaN, which we do not want, so to deal with this I am using fillna with a Timestamp in 1900, after applying pd.to_datetime to all date columns:
    date_cols = ['date_picked_up', 'date_paused', 'date_resumed', 'date_closed']
    for c in date_cols:
        # The format must match the sample strings, including the dashes.
        df[c] = pd.to_datetime(df[c], format='%Y-%m-%d %H:%M')
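To make the parsing step concrete, here is a minimal sketch using two values from the `date_paused` column above. It assumes a reasonably recent pandas; the series name `paused` is only for illustration. It shows that a missing value becomes `NaT` after conversion, and that a format string without dashes does not match these strings.

```python
import numpy as np
import pandas as pd

# Two sample values from the question: one timestamp string, one missing.
paused = pd.Series(['2018-10-26 11:08', np.nan])

# The format mirrors the strings exactly, including the dashes.
parsed = pd.to_datetime(paused, format='%Y-%m-%d %H:%M')

print(parsed.isna().tolist())  # [False, True]: the NaN became NaT

# A format without dashes does not describe these strings, so parsing fails.
try:
    pd.to_datetime(paused, format='%Y%m%d %H:%M')
except ValueError as exc:
    print('format mismatch:', exc)
```

Because missing values survive as `NaT`, they can be filtered out later with `isna()` or `dropna()` instead of being replaced by a placeholder date.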
Now the best way I found to aggregate the data by user and then by case is:
    # Parentheses let the method chain continue across lines.
    (df.fillna(pd.Timestamp('19000101'))
       .groupby(['user', 'case'] + date_cols)[date_cols]
       .count())
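The reason for the 1900 placeholder can be seen on a small frame. The sketch below is an illustration only, with a reduced three-row version of the sample data: it shows that grouping on a date column that contains `NaT` drops those rows, whereas grouping on `user` and `case` alone keeps every case without any placeholder.

```python
import pandas as pd

df = pd.DataFrame({
    'user': ['Jason', 'Jason', 'Georgia'],
    'case': ['a', 'c', 'b'],
    'date_paused': pd.to_datetime(['2018-10-26 11:08', None, '2019-01-25 12:11']),
})

# Using a column with NaT as a grouping key drops those rows by default.
with_nat_key = df.groupby(['user', 'case', 'date_paused']).size()
print(len(with_nat_key))  # 2: Jason's case 'c' is missing

# Grouping on the identifying columns alone keeps every case.
by_case = df.groupby(['user', 'case'])['date_paused'].first()
print(len(by_case))  # 3: case 'c' is present, with NaT as its value
```

If the date columns really are needed as keys, pandas 1.1 and later also accept `groupby(..., dropna=False)`, which keeps the missing keys as their own group.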
My goal (from this sample data) is two plots, one for Jason and one for Georgia, where the timestamps (ideally not the 1900 ones) show up along horizontal "lines", one for each case (on the y-axis). The closest example is here: Plotting labled time series in pandas, where instead of dogs, cats, and cows we would have (for Jason) cases a, c, and d on the y-axis.
I have found ideas on how to move everything to bokeh or d3 for what I really want (e.g.: https://github.com/jiahuang/d3-timeline, How to plot stacked event duration (Gantt Charts) using Python Pandas?), but I hope to find a solution in Python and Matplotlib/Seaborn, since I believe that my data structure is already in a good enough format.
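As a rough sketch of the kind of figure described above, the following uses Matplotlib only. It is one possible approach under stated assumptions, not an answer taken from this page: the helper name `plot_case_timelines` is hypothetical, missing timestamps are left as `NaT` and skipped rather than filled with a 1900 date, and each user gets a separate subplot with one horizontal line per case.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

date_cols = ['date_picked_up', 'date_paused', 'date_resumed', 'date_closed']

# Sample data from the question.
df = pd.DataFrame({
    'user': ['Jason', 'Georgia', 'Jason', 'Jason', 'Georgia'],
    'case': ['a', 'b', 'c', 'd', 'e'],
    'date_picked_up': ['2018-10-25 14:06', '2019-01-25 10:44', '2019-01-25 09:14',
                       '2019-01-25 12:12', '2019-02-21 10:01'],
    'date_paused': ['2018-10-26 11:08', '2019-01-25 12:11', np.nan, np.nan,
                    '2019-02-21 12:37'],
    'date_resumed': ['2018-10-26 11:20', '2019-01-25 15:21', np.nan, np.nan,
                     '2019-02-21 13:24'],
    'date_closed': ['2018-10-29 16:57', '2019-01-25 16:34', '2019-01-25 11:46',
                    '2019-01-25 15:24', '2019-01-25 13:56'],
})
for c in date_cols:
    df[c] = pd.to_datetime(df[c], format='%Y-%m-%d %H:%M')


def plot_case_timelines(frame):
    """Hypothetical helper: one subplot per user, one horizontal line per case."""
    users = sorted(frame['user'].unique())
    fig, axes = plt.subplots(len(users), 1, figsize=(9, 2.5 * len(users)),
                             squeeze=False)
    for ax, user in zip(axes[:, 0], users):
        cases = frame[frame['user'] == user].reset_index(drop=True)
        times = cases[date_cols].to_numpy()  # 2D datetime64 array, NaT if missing
        for y, row_times in enumerate(times):
            present = row_times[~np.isnat(row_times)]  # skip missing events
            ax.plot(present, np.full(len(present), y), marker='o')
        ax.set_yticks(list(range(len(cases))))
        ax.set_yticklabels(cases['case'].tolist())
        ax.set_ylim(-0.5, len(cases) - 0.5)
        ax.set_title(user)
        ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
        ax.tick_params(axis='x', labelrotation=30)
    fig.tight_layout()
    return fig, axes


fig, axes = plot_case_timelines(df)
# Call plt.show() or fig.savefig('timelines.png') to view the result.
```

The subplots deliberately do not share an x-axis, since the sample cases span several months; sharing it would compress the same-day cases into single points.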
python pandas matplotlib time-series pandas-groupby
asked Mar 7 at 4:48
nvergos