Ben Lau statistics . machine learning . programming . optimization . research

Log Transformation

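  • $E(Y) \geq e^{E(\ln Y)}$ because of Jensen’s inequality: $\ln$ is concave, so $E(\ln Y) \leq \ln E(Y)$. In other words, for an error term $\epsilon$ on the log scale with $E(\epsilon) = 0$, we have $E(\exp(\epsilon)) \geq 1$, so exponentiating the mean of $\ln Y$ underestimates the mean of $Y$.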
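  • But the median of $Y$ is the same as $\exp(E(\ln Y))$ when $\ln Y$ is symmetrically distributed (for example, normal). Because $\exp$ is a monotonic function, order is preserved, so the median of $Y$ equals $\exp$ of the median of $\ln Y$, and under symmetry the median of $\ln Y$ equals $E(\ln Y)$.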
  • After taking the log of Y, we might also want to take the log of some X if we assume a linear relationship between X and Y, because transforming one without the other can introduce severe nonlinear curvature that violates the linearity assumption.
  • The coefficient of $\ln X$ has a special interpretation in the log-log model, called elasticity. This parameter measures the percent increase in the median of the distribution of the untransformed Y variable corresponding to a small percent increase in the untransformed X variable. A close enough interpretation would be “There is a $\beta_1$% increase in the median of Y associated with a 1% increase in X”.
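
The points above can be checked with a small simulation. This is only a sketch under stated assumptions: the normal errors, the parameter values (`mu`, `sigma`, `beta0`, `beta1`), the uniform range for X, and the sample size are all made up for illustration and are not from any real data set.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
n = 200_000                     # illustrative sample size (assumption)

# Part 1: retransformation. Assume ln Y ~ N(mu, sigma^2), so Y is lognormal.
mu, sigma = 1.0, 0.8            # hypothetical values
log_y = rng.normal(mu, sigma, size=n)
y = np.exp(log_y)

mean_y = y.mean()                      # estimates E(Y)
exp_mean_log_y = np.exp(log_y.mean())  # estimates exp(E(ln Y))
median_y = np.median(y)                # estimates the median of Y

# Part 2: log-log model, ln Y = beta0 + beta1 * ln X + eps, with E(eps) = 0.
beta0, beta1 = 1.0, 0.5         # hypothetical values
x = rng.uniform(1.0, 10.0, size=n)
eps = rng.normal(0.0, sigma, size=n)
log_y2 = beta0 + beta1 * np.log(x) + eps

# np.polyfit returns coefficients from highest degree down: [slope, intercept].
slope, intercept = np.polyfit(np.log(x), log_y2, deg=1)

# Exact percent change in the median of Y for a 1% increase in X,
# to compare with the "beta1 percent" rule of thumb.
pct_change = (1.01 ** slope - 1.0) * 100.0

print(f"mean of Y:            {mean_y:.3f}")
print(f"exp(mean of ln Y):    {exp_mean_log_y:.3f}")
print(f"median of Y:          {median_y:.3f}")
print(f"estimated elasticity: {slope:.3f}")
print(f"% change in median for +1% in X: {pct_change:.3f}")
```

In this setup the sample mean of Y should come out noticeably larger than `exp_mean_log_y`, while the sample median should sit close to it, and `pct_change` should be close to the fitted slope. The elasticity reading is an approximation that works best for small percent changes in X.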