Repeating a string based on a column value (like multiplication of a string and a number in Python)



I have the following data frame (called df) with columns item_name and item_level:



item_name    item_level
------------------------
Item1        1
Item2        2
Item3        2
Item4        3


I would like to create a new column that indents each item according to its level. To do that, I would like to multiply the item_level by the string '---', with the idea that the string gets concatenated with itself as many times as the value of the integer it is multiplied by.
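As a plain Python illustration of that idea (standalone, not Spark code), multiplying a string by an integer repeats it:

level = 2
print(level * '---')             # ------
print(level * '---' + 'Item2')   # ------Item2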



My desired result is something like this:



item_name    item_level    new_column
---------------------------------------
Item1        1             ---Item1
Item2        2             ------Item2
Item3        2             ------Item3
Item4        3             ---------Item4


In PySpark, when I write the following command, the newly created column contains only null values:



from pyspark.sql import functions as F
df = df.withColumn('new_column',F.concat(F.lit(df.item_level*'---'),df.item_name))


The null values seem to come from multiplying the integers by the string. The concat function itself seems to work properly. For instance, the following works:



df = df.withColumn('new_column',F.concat(df.item_name,df.item_name))


I also tried a few other things. If I multiply the string by a constant number, the resulting string is displayed as desired:



number = 3
df = df.withColumn('new_column', F.lit(number*'---'))


Furthermore, first putting the '---' string into its own column (every row equal to '---') and then multiplying that column by the item_level column gives null values as well:



df = df.withColumn('padding', F.lit('---'))
df = df.withColumn('test', df.padding * df.item_level)


If I use pandas, however, this last piece of code does what I want. But I need to do this in PySpark.
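For reference, a minimal pandas sketch of the behavior described above (the DataFrame here is assumed, mirroring the example table):

import pandas as pd

pdf = pd.DataFrame({'item_name': ['Item1', 'Item2', 'Item3', 'Item4'],
                    'item_level': [1, 2, 2, 3]})
pdf['padding'] = '---'
# Elementwise string * int falls back to Python's string repetition in pandas.
pdf['new_column'] = pdf['padding'] * pdf['item_level'] + pdf['item_name']
print(pdf['new_column'].tolist())
# ['---Item1', '------Item2', '------Item3', '---------Item4']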










python apache-spark pyspark apache-spark-sql string-concatenation

asked Mar 6 at 14:46 by Irena Kuzmanovska; edited Mar 6 at 15:54 by pault
1 Answer






          There is a function pyspark.sql.functions.repeat that:




          Repeats a string column n times, and returns it as a new string column.




Concatenate the result of repeat with item_name, as you were already doing in your code. The only wrinkle is that you need to use pyspark.sql.functions.expr in order to pass a column value as an argument to a Spark function.



from pyspark.sql.functions import concat, expr

df.withColumn(
    "new_column",
    concat(expr("repeat('---', item_level)"), "item_name")
).show()
#+---------+----------+--------------+
#|item_name|item_level|    new_column|
#+---------+----------+--------------+
#|    Item1|         1|      ---Item1|
#|    Item2|         2|   ------Item2|
#|    Item3|         2|   ------Item3|
#|    Item4|         3|---------Item4|
#+---------+----------+--------------+


Note that show() right-justifies the displayed output, but the underlying data is as you desired.
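For comparison, a minimal sketch of the same idea using a Python UDF (not what the repeat approach above does under the hood); the built-in repeat via expr is generally preferable, since a UDF moves every row through Python:

from pyspark.sql.functions import concat, udf
from pyspark.sql.types import StringType

# Hypothetical helper: repeat '---' item_level times (empty string for nulls).
indent = udf(lambda level: '---' * int(level) if level is not None else '', StringType())

df = df.withColumn('new_column', concat(indent(df.item_level), df.item_name))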






answered Mar 6 at 15:51 by pault
• Thanks so much! This actually does the job! I was struggling so much to find the right way, and this is perfect! – Irena Kuzmanovska, Mar 7 at 9:28