Plot multiple time series events from pandas groupby object into single plot



I have a time-series related question on how to plot time stamps along a horizontal axis for multiple cases attributed to the same person. Let me explain:



Let us assume we have Jason and Georgia. Both of them work on different cases, each of which can go through these "events": start, pause, resume, end. Many cases just have a "start" and an "end", whereas others also include a pause-resume interval. While one case is paused, the user can work on a different case. I have all this information in a pandas DataFrame, and I collect the user-and-case-level information with a groupby.



Sample data for reproducible code (assuming pandas and numpy are imported):



import numpy as np
import pandas as pd

raw_data = {
    'user': ['Jason', 'Georgia', 'Jason', 'Jason', 'Georgia'],
    'case': ['a', 'b', 'c', 'd', 'e'],
    'date_picked_up': ['2018-10-25 14:06', '2019-01-25 10:44', '2019-01-25 09:14', '2019-01-25 12:12', '2019-02-21 10:01'],
    'date_paused': ['2018-10-26 11:08', '2019-01-25 12:11', np.nan, np.nan, '2019-02-21 12:37'],
    'date_resumed': ['2018-10-26 11:20', '2019-01-25 15:21', np.nan, np.nan, '2019-02-21 13:24'],
    'date_closed': ['2018-10-29 16:57', '2019-01-25 16:34', '2019-01-25 11:46', '2019-01-25 15:24', '2019-01-25 13:56'],
}
df = pd.DataFrame(raw_data, columns=['user', 'case', 'date_picked_up', 'date_paused', 'date_resumed', 'date_closed'])
df


This returns df, a pandas DataFrame with the progression of each case. When a case has no pause-resume interval, the corresponding values are np.nan. By default, pandas groupby drops rows whose group keys are NaN, which I do not want. To work around this, I first convert all date columns with pd.to_datetime and then, in the next step, fill the missing values with a Timestamp in 1900:



date_cols = ['date_picked_up', 'date_paused', 'date_resumed', 'date_closed']
for c in date_cols:
    # The sample timestamps are written with hyphens, so the format must include them
    df[c] = pd.to_datetime(df[c], format='%Y-%m-%d %H:%M')
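
As a quick check of this conversion step, here is a minimal sketch on a two-row subset of the sample data. It assumes the format string '%Y-%m-%d %H:%M', which matches how the sample timestamps are written, and the name `sample` is only illustrative. The missing pause entry comes out as NaT rather than raising an error.

```python
import numpy as np
import pandas as pd

# Two rows taken from the sample data: case "a" has a pause, case "c" does not.
sample = pd.DataFrame({
    "case": ["a", "c"],
    "date_paused": ["2018-10-26 11:08", np.nan],
})

# Parse with a format that matches the hyphenated timestamps.
sample["date_paused"] = pd.to_datetime(sample["date_paused"], format="%Y-%m-%d %H:%M")

print(sample.dtypes)
print(sample["date_paused"].isna().tolist())
```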


Now the best way I found to aggregate the data by user and then by case is:



(df.fillna(pd.Timestamp('19000101'))
   .groupby(['user', 'case', 'date_picked_up', 'date_paused', 'date_resumed', 'date_closed'])[['date_picked_up', 'date_paused', 'date_resumed', 'date_closed']]
   .count())
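
One possible way to avoid the 1900 placeholder is to group only on the columns that are never missing. The sketch below is an alternative to the aggregation above, not the same operation. It assumes each (user, case) pair appears exactly once, uses a three-row subset of the sample data, and `per_case` is an illustrative name. Because the date columns are values rather than group keys, the missing pause and resume entries simply stay NaT.

```python
import numpy as np
import pandas as pd

date_cols = ["date_picked_up", "date_paused", "date_resumed", "date_closed"]

# Three rows from the sample data; case "c" has no pause-resume interval.
df = pd.DataFrame({
    "user": ["Jason", "Georgia", "Jason"],
    "case": ["a", "b", "c"],
    "date_picked_up": ["2018-10-25 14:06", "2019-01-25 10:44", "2019-01-25 09:14"],
    "date_paused": ["2018-10-26 11:08", "2019-01-25 12:11", np.nan],
    "date_resumed": ["2018-10-26 11:20", "2019-01-25 15:21", np.nan],
    "date_closed": ["2018-10-29 16:57", "2019-01-25 16:34", "2019-01-25 11:46"],
})
for c in date_cols:
    df[c] = pd.to_datetime(df[c], format="%Y-%m-%d %H:%M")

# Group on user and case only; first() keeps the single timestamp per group
# and leaves NaT where a case was never paused.
per_case = df.groupby(["user", "case"])[date_cols].first()
print(per_case)
```

With this layout, `per_case.loc["Jason"]` gives one row per case for a single user, which is close to the shape a per-user plot needs.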


My goal (from this sample data) is two plots, one for Jason and one for Georgia, where the timestamps (ideally not the 1900 ones) show up along horizontal "lines", one line per case (on the y-axis). The closest example is "Plotting labled time series in pandas", where instead of dogs, cats, and cows we would have (for Jason) cases a, c, and d on the y-axis.



I have found ideas on how to move everything to Bokeh or D3 for what I really want (e.g. https://github.com/jiahuang/d3-timeline and "How to plot stacked event duration (Gantt Charts) using Python Pandas?"), but I hope to find a solution in Python with Matplotlib/Seaborn, since I believe my data structure is already in a good enough format.
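
To make the target concrete, here is a rough Matplotlib sketch of the kind of figure described above: one figure per user, one horizontal line per case, and a gap where a case was paused. It is only one possible approach under stated assumptions. The helper `work_segments` and the dictionary `figures` are names made up for this illustration, and missing pause or resume values are treated as "no pause" instead of being filled with a 1900 timestamp. The sample data is used exactly as posted, including case e, whose closing date precedes its pick-up date.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to display windows
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

raw_data = {
    "user": ["Jason", "Georgia", "Jason", "Jason", "Georgia"],
    "case": ["a", "b", "c", "d", "e"],
    "date_picked_up": ["2018-10-25 14:06", "2019-01-25 10:44", "2019-01-25 09:14",
                       "2019-01-25 12:12", "2019-02-21 10:01"],
    "date_paused": ["2018-10-26 11:08", "2019-01-25 12:11", np.nan, np.nan,
                    "2019-02-21 12:37"],
    "date_resumed": ["2018-10-26 11:20", "2019-01-25 15:21", np.nan, np.nan,
                     "2019-02-21 13:24"],
    # Kept as posted: case "e" closes before it is picked up.
    "date_closed": ["2018-10-29 16:57", "2019-01-25 16:34", "2019-01-25 11:46",
                    "2019-01-25 15:24", "2019-01-25 13:56"],
}
date_cols = ["date_picked_up", "date_paused", "date_resumed", "date_closed"]
df = pd.DataFrame(raw_data)
for c in date_cols:
    df[c] = pd.to_datetime(df[c], format="%Y-%m-%d %H:%M")


def work_segments(row):
    """Hypothetical helper: (start, end) pairs during which a case was worked on."""
    if pd.isna(row["date_paused"]) or pd.isna(row["date_resumed"]):
        return [(row["date_picked_up"], row["date_closed"])]
    return [
        (row["date_picked_up"], row["date_paused"]),
        (row["date_resumed"], row["date_closed"]),
    ]


figures = {}
for user, cases in df.groupby("user"):
    fig, ax = plt.subplots(figsize=(8, 2 + 0.4 * len(cases)))
    for y, (_, row) in enumerate(cases.iterrows()):
        # One line per working segment, all on the same height for this case.
        for start, end in work_segments(row):
            ax.plot([start, end], [y, y], marker="o", color="C0")
    ax.set_yticks(range(len(cases)))
    ax.set_yticklabels(cases["case"].tolist())
    ax.set_ylim(-0.5, len(cases) - 0.5)
    ax.set_title(user)
    fig.autofmt_xdate()
    figures[user] = fig
    fig.savefig(f"timeline_{user}.png")
```

Because Jason's case a is from October 2018 and his other cases are from January 2019, the shared x-axis spans several months and the individual segments look short. Restricting the x-limits or plotting elapsed time instead of absolute time would be separate design choices.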










      python pandas matplotlib time-series pandas-groupby
















      asked Mar 7 at 4:48









      nvergos





















