How to combine two rows of a Dataset into a single row in Spark using Java

























I am reading transactions from a Kafka topic in JSON format. I then
applied some transformations to aggregate them by txn_status.
Below is the schema.



    root
     |-- window: struct (nullable = true)
     |    |-- start: timestamp (nullable = true)
     |    |-- end: timestamp (nullable = true)
     |-- txn_status: string (nullable = true)
     |-- count: long (nullable = false)



After grouping by the given window, my batch output looks like this:
[![enter image description here][1]][1]



But I want the output in the JSON format below.




    {
      "start_end_time": "28/12/2018 11:32:00.000",
      "count_Total": 6,
      "count_RCVD": 5,
      "count_FAILED": 1
    }



How can I combine two rows in a Spark Dataset?

[1]: https://i.stack.imgur.com/sCJuX.jpg
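To make the target shape concrete, here is a minimal plain-Java sketch (no Spark dependency) that renders one such record. The class and method names are hypothetical, and it assumes that `start_end_time` holds the window start and that RCVD and FAILED are the only statuses, neither of which is confirmed above.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Spark: renders one combined record as JSON.
final class TxnSummaryJson {

    // Matches the timestamp style shown above: 28/12/2018 11:32:00.000
    private static final DateTimeFormatter START_FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss.SSS");

    static String toJson(LocalDateTime windowStart, long rcvd, long failed) {
        // Assumption: RCVD and FAILED are the only statuses, so they add up to the total.
        long total = rcvd + failed;
        return "{"
                + "\"start_end_time\":\"" + START_FORMAT.format(windowStart) + "\","
                + "\"count_Total\":" + total + ","
                + "\"count_RCVD\":" + rcvd + ","
                + "\"count_FAILED\":" + failed
                + "}";
    }

    public static void main(String[] args) {
        System.out.println(toJson(LocalDateTime.of(2018, 12, 28, 11, 32, 0), 5, 1));
    }
}
```

String concatenation is enough here because every value is a number or a fixed-format timestamp; free-text fields would need a real JSON library for escaping.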










      java apache-spark dataset














edited Mar 8 at 7:19, Swetha

asked Mar 8 at 4:07, Swetha






















          1 Answer
Based on the image you posted, I created a DataFrame with the same data, registered it as a temp view, and wrote a query that produces the output you asked for.



          Scala Code:



    import spark.implicits._ // needed for .toDF outside spark-shell

    // One row per (status, window), mirroring the aggregated batch output.
    case class txn_rec(txn_status: String, count: Int, start_end_time: String)

    val txDf = sc.parallelize(Seq(
      txn_rec("FAIL", 9, "2019-03-08 16:40:00, 2019-03-08 16:57:00"),
      txn_rec("RCVD", 161, "2019-03-08 16:40:00, 2019-03-08 16:57:00"))).toDF

    txDf.createOrReplaceTempView("temp")

    // The scalar subqueries assume that temp holds a single window.
    val resDF = spark.sql("select start_end_time, (select sum(count) from temp) as total_count, (select count from temp where txn_status='RCVD') as rcvd_count, (select count from temp where txn_status='FAIL') as failed_count from temp group by start_end_time")

    resDF.show

    resDF.toJSON.collectAsList.toString


The output is shown in the screenshots below.

Output-1

Output-2
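Since the question asks for Java, the roll-up itself can be sketched in plain Java without Spark: fold every row that shares a window into one record holding the total, the RCVD count, and the FAIL count. This is only an illustration of the logic under stated assumptions. `WindowRollup`, `StatusCount`, and `rollUp` are hypothetical names, and the status labels follow the sample data above. In Spark's Java API the same shape is usually expressed as a `groupBy` on the window with conditional sums built from `functions.when(...).otherwise(...)`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Spark-free illustration of combining per-status rows into one row per window.
final class WindowRollup {

    /** One aggregated input row: (window label, txn_status, count). Stand-in for a Spark Row. */
    static final class StatusCount {
        final String window;
        final String status;
        final long count;

        StatusCount(String window, String status, long count) {
            this.window = window;
            this.status = status;
            this.count = count;
        }
    }

    /** Returns, per window, a long[3] holding [0] = total, [1] = RCVD, [2] = FAIL. */
    static Map<String, long[]> rollUp(List<StatusCount> rows) {
        Map<String, long[]> byWindow = new LinkedHashMap<>();
        for (StatusCount row : rows) {
            long[] sums = byWindow.computeIfAbsent(row.window, w -> new long[3]);
            sums[0] += row.count; // every status counts toward the total
            if ("RCVD".equals(row.status)) {
                sums[1] += row.count;
            }
            if ("FAIL".equals(row.status)) {
                sums[2] += row.count;
            }
        }
        return byWindow;
    }

    public static void main(String[] args) {
        List<StatusCount> rows = new ArrayList<>();
        rows.add(new StatusCount("2019-03-08 16:40:00, 2019-03-08 16:57:00", "FAIL", 9));
        rows.add(new StatusCount("2019-03-08 16:40:00, 2019-03-08 16:57:00", "RCVD", 161));

        for (Map.Entry<String, long[]> e : rollUp(rows).entrySet()) {
            long[] s = e.getValue();
            System.out.println(e.getKey() + " total=" + s[0] + " rcvd=" + s[1] + " failed=" + s[2]);
        }
    }
}
```

Unlike the scalar subqueries in the SQL above, this per-window fold keeps working when the input contains more than one window, because each window accumulates its own sums.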

















answered Mar 8 at 5:27, Sasi












• Thank you. I am new to Spark. I want to implement this in Java using Dataset, but I didn't find parallelize among the DataFrame functions. Can it be done in Java?

  – Swetha
  Mar 8 at 6:19

• Can you tell me more about how the input batch data is represented, and how you want it converted into the output?

  – Sasi
  Mar 8 at 6:49

• I have edited the post. Can you check now?

  – Swetha
  Mar 8 at 7:20
















