pandas convert columns to percentages of the totals
I have a dataframe with four columns: an ID and three categories that results fell into.

         <80%  80-90  >90
    id
    1       2      4    4
    2       3      6    1
    3       7      0    3

I would like to convert it to percentages, i.e.:

         <80%  80-90  >90
    id
    1     20%    40%  40%
    2     30%    60%  10%
    3     70%     0%  30%

This seems like it should be within pandas' capabilities, but I just can't figure it out.
Thanks in advance!
python pandas
asked Feb 2 '17 at 15:39 by DTATSO (edited Feb 2 '17 at 16:02)
Please provide an example dataframe; your numbers are a bit hard to interpret at first glance.
– instant
Feb 2 '17 at 15:41
I'm not sure how to post the dataframe, and I apologize that my example lost its format, but I have an index of ID and columns for <80%, 80%-90% and >90%. Then I have data in the rows, so row 0 may be index 1 with [3, 4, 3]. I would like row 0, index 1, to have 30%, 40%, 30%. I am very new to pandas, sorry; I am still explaining it poorly.
– DTATSO
Feb 2 '17 at 15:56
I guess it actually looks more like this: results <80%, 80%-90%, >90%; id 1: 3, 4, 3; id 2: 7, 3, 0. And I want: results <80%, 80%-90%, >90%; id 1: 30%, 40%, 30%; id 2: 70%, 30%, 0%.
– DTATSO
Feb 2 '17 at 15:57
1 Answer
You can do this using the basic pandas methods .div and .sum, using the axis argument to make sure the calculations happen the way you want:

    cols = ['<80%', '80-90', '>90']
    df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0).multiply(100)

- Calculate the sum of each row (df[cols].sum(axis=1)). axis=1 makes the summation run across the columns of each row, rather than down each column.
- Divide the dataframe by the resulting series (df[cols].div(df[cols].sum(axis=1), axis=0)). axis=0 aligns the series with the row index, so each row is divided by its own total.
- To finish, multiply the results by 100 so they are percentages between 0 and 100 instead of proportions between 0 and 1 (or you can skip this step and store them as proportions).
answered Feb 2 '17 at 15:57 by ASGM (edited Mar 7 at 2:51)
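For readers who want to try the approach end to end, here is a minimal sketch. It assumes the dataframe looks like the sample in the question, with id as the index; the names row_totals and pct are illustrative and do not come from the answer.

```python
import pandas as pd

# Sample data reconstructed from the question; "id" is assumed to be the index.
df = pd.DataFrame(
    {"<80%": [2, 3, 7], "80-90": [4, 6, 0], ">90": [4, 1, 3]},
    index=pd.Index([1, 2, 3], name="id"),
)

cols = ["<80%", "80-90", ">90"]

# One total per id: axis=1 sums across the columns of each row.
row_totals = df[cols].sum(axis=1)

# Divide each row by its own total (axis=0 aligns the totals with the index),
# then scale from proportions (0-1) to percentages (0-100).
pct = df[cols].div(row_totals, axis=0).multiply(100)

print(pct)
```

The result is kept in a separate variable here so the original counts stay available; assigning back to df[cols], as the answer does, overwrites them.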
Thank you so much, this did the trick. Thanks for explaining the portions as well. Pandas seems to be a great tool that hopefully I'll get better at soon.
– DTATSO
Feb 2 '17 at 16:23
"Proportions" are percentages. 0.1 IS 10%. The % is basically a "divide by 100" operator. Putting 100 there is wrong and will probably lead to all kinds of errors down the line.
– Jan Christoph Terasa
Feb 2 '17 at 16:25
@ChristophTerasa I'm not sure I follow. I understand that you can express the same value as 0.1 or as 10%, but the OP asked for the latter. Whether or not that leads to problems down the line depends on the OP's use case - maybe it needs to be in % format for some reason.
– ASGM
Feb 2 '17 at 16:38
OK, maybe "wrong" is not the correct word here. Your solution certainly is correct. I should probably say it is a dangerous paradigm. You cannot store 10% in a DataFrame/array, but you can store 0.1.
– Jan Christoph Terasa
Feb 2 '17 at 16:52
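One way to reconcile the two positions in these comments is to store proportions as floats and format them as percent strings only for display. The sketch below assumes the sample data from the question; the names props and display are illustrative, and this is one possible approach rather than something either commenter proposed.

```python
import pandas as pd

# Sample data reconstructed from the question; "id" is assumed to be the index.
df = pd.DataFrame(
    {"<80%": [2, 3, 7], "80-90": [4, 6, 0], ">90": [4, 1, 3]},
    index=pd.Index([1, 2, 3], name="id"),
)

# Store proportions (0-1) as floats so they stay usable in arithmetic.
props = df.div(df.sum(axis=1), axis=0)

# Display-only copy: each value rendered as a percent string such as "20%".
# These are strings, so keep props around for any further calculations.
display = props.apply(lambda col: col.map("{:.0%}".format))

print(display)
```

The float version is the one to compute with; the string version is only suitable for reports or printing.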