Multiple least squares quadratic fit in ggplot


Notice that your graphic constructed from Problem 4 shows a quadratic
or curved relationship between log_wage and exp. The next
task is to plot three quadratic functions, one for each race level: "black",
"white" and "other". To estimate the quadratic fit, you can use the
following function quad_fit:




```r
# Fit log_wage as a quadratic in exp and return the three coefficients
quad_fit <- function(data_sub) {
  return(lm(log_wage ~ exp + I(exp^2), data = data_sub)$coefficients)
}

quad_fit(salary_data)
```



The above function computes the least squares quadratic fit and
returns coefficients a1, a2, a3, where



Y(hat) = a1 + a2x + a3x^2



where Y(hat) = log(wage) and x = exp
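To make the role of the three coefficients concrete, here is a minimal sketch that evaluates the quadratic by hand and compares it with the fitted values that lm computes itself. The data frame `toy` and its simulated values are assumptions for illustration only; they stand in for salary_data, which is not shown in the question.

```r
# Simulated stand-in for salary_data (hypothetical values, illustration only)
set.seed(1)
toy <- data.frame(exp = runif(200, 0, 40))
toy$log_wage <- 5 + 0.05 * toy$exp - 0.001 * toy$exp^2 + rnorm(200, sd = 0.3)

# Same definition as in the assignment: returns the coefficient vector
quad_fit <- function(data_sub) {
  lm(log_wage ~ exp + I(exp^2), data = data_sub)$coefficients
}

a <- quad_fit(toy)  # named vector: (Intercept), exp, I(exp^2)

# Evaluate Y(hat) = a1 + a2*x + a3*x^2 directly from the coefficients
y_hat <- a[1] + a[2] * toy$exp + a[3] * toy$exp^2

# Compare against the fitted values of the full lm object
fit <- lm(log_wage ~ exp + I(exp^2), data = toy)
same <- isTRUE(all.equal(unname(y_hat), unname(fitted(fit))))
print(same)
```

Because quad_fit returns only the coefficients, anything that needs the model object (such as predict()) has to be replaced by this kind of direct evaluation.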



Use ggplot to accomplish this task or use base R graphics for
partial credit. Make sure to include a legend and appropriate labels.




My attempt



    blackfit <- quad_fit(salary_data[salary_data$race == "black", ])
    whitefit <- quad_fit(salary_data[salary_data$race == "white", ])
    otherfit <- quad_fit(salary_data[salary_data$race == "other", ])

    yblack <- blackfit[1] + blackfit[2]*salary_data$exp + blackfit[3]*(salary_data$exp)^2
    ywhite <- whitefit[1] + whitefit[2]*salary_data$exp + whitefit[3]*(salary_data$exp)^2
    yother <- otherfit[1] + otherfit[2]*salary_data$exp + otherfit[3]*(salary_data$exp)^2

    soloblack <- salary_data[salary_data$race == "black", ]
    solowhite <- salary_data[salary_data$race == "white", ]
    soloother <- salary_data[salary_data$race == "other", ]

    ggplot(data = soloblack) +
      geom_point(aes(x = exp, y = log_wage)) +
      stat_smooth(aes(y = log_wage, x = exp), formula = y ~ yblack)



This is only a first attempt, for the data filtered to race == "black".
I am not clear on what the formula argument should look like, because the quad_fit function already seems to do the calculation for you.










  • Where is your graphic constructed from Problem 4? Simply pass in subsets to that process.

    – Parfait
    Mar 9 at 1:18











  • `library(ggplot2)` and `ggplot(data = salary_data) + geom_point(aes(x = exp, y = log_wage, alpha = exp)) + labs(x = "Job Experience", y = "Log of Wage", title = "Salary Dataset")`. This is the code for the graph, but I feel my arguments in geom_smooth are wrong.

    – Dario Gentiletti
    Mar 9 at 5:07

















r ggplot2 quadratic-curve






asked Mar 8 at 22:49









Dario Gentiletti

1 Answer
Consider plotting the fitted values computed from the output of quad_fit (as shown by @StefanK), and use by to produce one plot for each distinct value of race. Note that quad_fit returns only the coefficient vector, not the lm object, so predict() cannot be called on its result; the fitted values are computed from the coefficients directly:

    reg_plot <- function(sub) {
      # PREDICTED DATA FOR LINE PLOT
      # q_fit holds the coefficients a1, a2, a3 for this subset
      q_fit <- quad_fit(sub)
      predicted_df <- data.frame(
        exp = sub$exp,
        wage_pred = q_fit[1] + q_fit[2] * sub$exp + q_fit[3] * sub$exp^2
      )

      # ORIGINAL SCATTER PLOT WITH PREDICTED LINE
      ggplot(data = sub) +
        geom_point(aes(x = exp, y = log_wage, alpha = exp)) +
        labs(x = "Job Experience", y = "Log of Wage",
             title = paste("Wage and Job Experience Plot for",
                           sub$race[[1]], "in Salary Dataset")) +
        geom_line(color = "red", data = predicted_df, aes(x = exp, y = wage_pred))
    }

    # RUN GRAPHS FOR EACH RACE
    by(salary_data, salary_data$race, reg_plot)
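The assignment asks for the three curves with a legend, which suggests a single plot rather than one plot per race. A possible sketch of that variant follows: it builds one data frame of fitted values per race from the quad_fit coefficients and maps race to colour so that ggplot draws the legend. The simulated salary_data and the helper name `pred_df` are assumptions for illustration; the real dataset is not shown in the question.

```r
library(ggplot2)

# Simulated stand-in for salary_data (hypothetical values, illustration only)
set.seed(42)
salary_data <- data.frame(
  exp  = runif(300, 0, 40),
  race = sample(c("black", "white", "other"), 300, replace = TRUE)
)
salary_data$log_wage <- 5 + 0.05 * salary_data$exp -
  0.001 * salary_data$exp^2 + rnorm(300, sd = 0.3)

# Same definition as in the assignment: returns the coefficient vector
quad_fit <- function(data_sub) {
  lm(log_wage ~ exp + I(exp^2), data = data_sub)$coefficients
}

# For each race, evaluate a1 + a2*x + a3*x^2 on a grid of 100 exp values
pred_df <- do.call(rbind, lapply(split(salary_data, salary_data$race), function(sub) {
  a <- quad_fit(sub)
  grid <- seq(min(sub$exp), max(sub$exp), length.out = 100)
  data.frame(race = sub$race[1], exp = grid,
             log_wage = a[1] + a[2] * grid + a[3] * grid^2)
}))

# Points and fitted curves share the colour scale, so one legend covers both
p <- ggplot(salary_data, aes(x = exp, y = log_wage, color = race)) +
  geom_point(alpha = 0.3) +
  geom_line(data = pred_df) +
  labs(x = "Job Experience", y = "Log of Wage", color = "Race",
       title = "Quadratic least squares fit by race")
print(p)
```

If using quad_fit is not a requirement, replacing the geom_line layer with `geom_smooth(method = "lm", formula = y ~ x + I(x^2), se = FALSE)` should draw the same per-race least squares curves, since the formula there is written in terms of the x and y aesthetics rather than the column names.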





answered Mar 9 at 18:27

Parfait
