Feature tools - temporal cutoffs do not register time index variable



I am using Featuretools to create monthly aggregations.

I have toy data consisting of 1000 loan applications (columns ID_APPLICATION and TIME_APPLICATION) and 200k transactions, i.e. roughly 200 transactions per person. Each transaction has AMOUNT, TIME and other columns that are not needed for this example. For each person, the TIME column holds about 200 different timestamps spread over the previous year or more.
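
To make the shape of the data concrete, here is a minimal sketch that generates comparable toy data in plain Python. The column names come from the question; the dates, amounts and distributions are invented for illustration, and transactions are limited to the year before the application for simplicity.

```python
import random
from datetime import datetime, timedelta

random.seed(0)  # fixed seed so the toy data is reproducible

N_APPLICATIONS = 1000
TRANSACTIONS_PER_APPLICATION = 200  # roughly 200 per person, as described above

applications = []
transactions = []
for app_id in range(N_APPLICATIONS):
    # Hypothetical application dates spread over 2018
    time_application = datetime(2018, 1, 1) + timedelta(days=random.randrange(365))
    applications.append(
        {"ID_APPLICATION": app_id, "TIME_APPLICATION": time_application}
    )
    for _ in range(TRANSACTIONS_PER_APPLICATION):
        # Each transaction happens 1 to 365 days before its application
        transactions.append(
            {
                "ID_TRANSACTION": len(transactions),
                "ID_APPLICATION": app_id,
                "TIME": time_application - timedelta(days=random.randrange(1, 366)),
                "AMOUNT": round(random.uniform(1, 1000), 2),  # invented amounts
            }
        )
```

Both lists of dictionaries can be passed to `pandas.DataFrame(...)` to obtain the `applications` and `transactions` dataframes used below.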



constants.py:

    ID_APPLICATION_COLUMN = "ID_APPLICATION"
    ID_TRANSACTIONS_COLUMN = "ID_TRANSACTION"
    TIME_COLUMN = "TIME"
    TIME_APPLICATION_COLUMN = "TIME_APPLICATION"
    ENTITY_SET_NAME = "clients"
    TRANSACTIONS_ENTITY_NAME = "transactions"
    APPLICATIONS_ENTITY_NAME = "applications"


Creation of the entity set:

    import featuretools as ft
    import constants as cnst

    # Assumed: the entity set creation was not shown in the original snippet
    entity_set = ft.EntitySet(id=cnst.ENTITY_SET_NAME)

    # Fill the entity set with the dataframes and declare, for each one,
    # which column is the index and which is the time index
    entity_set.entity_from_dataframe(entity_id=cnst.TRANSACTIONS_ENTITY_NAME,
                                     dataframe=transactions,
                                     index=cnst.ID_TRANSACTIONS_COLUMN,
                                     time_index=cnst.TIME_COLUMN)
    entity_set.entity_from_dataframe(entity_id=cnst.APPLICATIONS_ENTITY_NAME,
                                     dataframe=applications,
                                     index=cnst.ID_APPLICATION_COLUMN,
                                     time_index=cnst.TIME_APPLICATION_COLUMN)

    # Relationship between the entities: one application has many transactions
    r_transactions_applications = ft.Relationship(
        parent_variable=entity_set[cnst.APPLICATIONS_ENTITY_NAME][cnst.ID_APPLICATION_COLUMN],
        child_variable=entity_set[cnst.TRANSACTIONS_ENTITY_NAME][cnst.ID_APPLICATION_COLUMN])
    entity_set.add_relationship(r_transactions_applications)



However, I have a problem with temporal cutoffs when I create and apply them:



    default_agg_primitives = ["count", "sum", "std", "max", "mode", "mean"]
    default_trans_primitives = ["month", "day", "time_since_previous"]

    # Six cutoff times per application, spaced by window_size
    temporal_cutoffs = ft.make_temporal_cutoffs(
        instance_ids=applications[cnst.ID_APPLICATION_COLUMN],
        cutoffs=applications[cnst.TIME_APPLICATION_COLUMN],
        window_size='1m',
        num_windows=6)

    transformed_data = ft.dfs(entityset=entity_set,
                              target_entity=cnst.APPLICATIONS_ENTITY_NAME,
                              cutoff_time=temporal_cutoffs,
                              cutoff_time_in_index=True,
                              trans_primitives=default_trans_primitives,
                              agg_primitives=default_agg_primitives,
                              max_depth=2)


Because I am aggregating at the application level, I get 1000 rows without temporal cutoffs. With them I get 6000 rows, but 5000 of those (every window except the last one for each application) contain only 0 or NaN, and the remaining 1000 are identical to the result without temporal cutoffs.
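
The result I would expect instead can be sketched in plain Python for a single application. This is not Featuretools code: it approximates a month as 30 days, assumes that rows at or before the cutoff are included, and uses an invented transaction schedule. The helper names `temporal_cutoffs` and `count_until` are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=30)  # assumption: one "month" approximated as 30 days
NUM_WINDOWS = 6


def temporal_cutoffs(application_time, window=WINDOW, num_windows=NUM_WINDOWS):
    """Return num_windows cutoff times ending at application_time, oldest first."""
    return [application_time - window * k for k in range(num_windows - 1, -1, -1)]


def count_until(transaction_times, cutoff):
    """Count the transactions that happened at or before the cutoff."""
    return sum(1 for t in transaction_times if t <= cutoff)


# Hypothetical toy data: one application, one transaction every 10 days
# over the 200 days before it
application_time = datetime(2019, 3, 1)
transaction_times = [application_time - timedelta(days=10 * i) for i in range(1, 21)]

cutoffs = temporal_cutoffs(application_time)
counts = [count_until(transaction_times, c) for c in cutoffs]

for cutoff, count in zip(cutoffs, counts):
    print(cutoff.date(), count)
```

In this sketch every cutoff gets a growing, non-zero count (6, 9, 12, 15, 18, 20), which is the kind of per-window aggregate I expected for each of the six rows of an application, rather than 0 or NaN in the first five.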



To me, it seems that the TIME column is not registered as the time index, so the dataset is not split by cutoff time.

Where can I set this?










Tags: python feature-extraction featuretools






asked 2 days ago by johnnyheineken





















