Repeating a string based on a column value (like multiplication of a string and a number in Python)
I have the following data frame (called df) with columns item_name and item_level:
item_name  item_level
---------  ----------
Item1      1
Item2      2
Item3      2
Item4      3
I would like to create a new column that produces an indentation of the items depending on their level. To do that, I would like to multiply item_level by the string '---', with the idea that the string gets concatenated with itself as many times as the value of the integer it is multiplied by.
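For reference, this is the plain-Python behavior I have in mind (a small standalone sketch, not pyspark):

```python
# In Python, multiplying a string by an integer repeats the string.
level = 3
padded = level * '---' + 'Item4'
print(padded)  # ---------Item4
```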
My desired result is something like this:
item_name  item_level  new_column
---------  ----------  --------------
Item1      1           ---Item1
Item2      2           ------Item2
Item3      2           ------Item3
Item4      3           ---------Item4
In pyspark, when I write the following command, the created column contains only null values:
from pyspark.sql import functions as F
df = df.withColumn('new_column',F.concat(F.lit(df.item_level*'---'),df.item_name))
The null values seem to come from the multiplication of the integers with the string. The concat function seems to work properly. For instance, the following works:
df = df.withColumn('new_column',F.concat(df.item_name,df.item_name))
I also tried a few other things. If I use a constant number to multiply the string, the resulting string is displayed as desired:
number = 3
df = df.withColumn('new_column', F.lit(number*'---'))
Furthermore, first adding the '---' string as a column (identical in every row) and then multiplying that column by the item_level column gives null values as well:
df = df.withColumn('padding',F.lit('---'))
df = df.withColumn('test',df.padding*df.item_level)
If I use pandas, however, this last piece of code does what I want. But I need to do this in pyspark.
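For completeness, here is a minimal pandas sketch of that last approach (same column names as my example); it relies on pandas applying Python's str * int repetition element-wise to an object-dtype column:

```python
import pandas as pd

df = pd.DataFrame({
    "item_name": ["Item1", "Item2", "Item3", "Item4"],
    "item_level": [1, 2, 2, 3],
})

# The constant column has object dtype, so multiplying it by an integer
# column repeats each string element-wise, as in plain Python.
df["padding"] = "---"
df["new_column"] = df["padding"] * df["item_level"] + df["item_name"]
print(df["new_column"].tolist())
# ['---Item1', '------Item2', '------Item3', '---------Item4']
```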
python apache-spark pyspark apache-spark-sql string-concatenation
edited Mar 6 at 15:54 by pault
asked Mar 6 at 14:46 by Irena Kuzmanovska
1 Answer
There is a function pyspark.sql.functions.repeat that:

Repeats a string column n times, and returns it as a new string column.

Concatenate the result of repeat with the item_name, as you were doing in your code. The only wrinkle is that you need to use pyspark.sql.functions.expr in order to pass a column value as an argument to a Spark function.
from pyspark.sql.functions import concat, expr
df.withColumn(
"new_column",
concat(expr("repeat('---', item_level)"), "item_name")
).show()
#+---------+----------+--------------+
#|item_name|item_level| new_column|
#+---------+----------+--------------+
#| Item1| 1| ---Item1|
#| Item2| 2| ------Item2|
#| Item3| 2| ------Item3|
#| Item4| 3|---------Item4|
#+---------+----------+--------------+
Note that show() will right-justify the displayed output, but the underlying data is as you desired.
Thanks so much! This actually does the job! I was struggling so much to find the right way, and this is perfect!
– Irena Kuzmanovska, Mar 7 at 9:28
answered Mar 6 at 15:51 by pault