I have come across a number of articles that test the Kuznets curve. They use GDP and log squared GDP as independent variables. I just don't get why they have to create a brand new independent variable, the "log squared GDP." What is the purpose of this? Why can't I just interpret the GDP coefficient itself? For example, if the GDP coefficient is negative, I would just interpret it as a negative relationship between inequality and GDP. Why add log squared GDP?
There are several statistical reasons to convert a variable to its logarithm. A log transformation can, for example, turn a highly right-skewed variable into one that is closer to normally distributed. It is also useful when you want to linearize a relationship. GDP per capita and income inequality typically meet these criteria, so they are frequently modeled in log form.
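To make the skewness point concrete, here is a minimal sketch. It uses synthetic log-normal values as a stand-in for GDP per capita (an assumption for illustration, not real data), and the `skewness` helper is a hypothetical function written for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, right-skewed "GDP per capita" values (log-normal draw).
# Illustrative only: these are NOT real GDP figures.
gdp = rng.lognormal(mean=9.0, sigma=1.0, size=5000)

def skewness(x):
    """Sample skewness: mean of the cubed standardized deviations."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

skew_raw = skewness(gdp)          # strongly positive for skewed data
skew_log = skewness(np.log(gdp))  # close to zero after the transform

print(f"skewness of GDP:      {skew_raw:.2f}")
print(f"skewness of log(GDP): {skew_log:.2f}")
```

On data like this, the raw values have a long right tail, while the logged values are roughly symmetric.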
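Because the Kuznets curve is an inverted U, a single linear term cannot capture it: one coefficient can only describe a relationship that rises or falls over the whole range. One approach to testing the curve is therefore to fit a model with a polynomial (quadratic) effect, that is, to add log squared GDP, (log GDP)^2, alongside the GDP term. An inverted U then shows up as a positive coefficient on the linear term and a negative coefficient on the squared term.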
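As a rough illustration of why the squared term matters, the sketch below fits both a linear-only and a quadratic model to synthetic data. The data-generating process (an inverted U peaking at log GDP = 8.5), the noise level, and all variable names are assumptions made up for this example; real studies would use actual data and usually more controls.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data with an ASSUMED inverted-U relationship (not real data):
# inequality peaks at log_gdp = 8.5 and falls off on either side.
log_gdp = rng.uniform(6.0, 11.0, size=500)
inequality = 0.45 - 0.02 * (log_gdp - 8.5) ** 2 + rng.normal(0.0, 0.01, size=500)

# Linear-only model: inequality = a0 + a1 * log_gdp
slope_linear = np.polyfit(log_gdp, inequality, 1)[0]

# Quadratic model: inequality = b0 + b1 * log_gdp + b2 * log_gdp^2
X = np.column_stack([np.ones_like(log_gdp), log_gdp, log_gdp ** 2])
beta, *_ = np.linalg.lstsq(X, inequality, rcond=None)
b0, b1, b2 = beta

# Turning point of the fitted parabola: where the derivative b1 + 2*b2*x is zero.
turning_point = -b1 / (2 * b2)

print(f"linear-only slope:  {slope_linear:.4f}")
print(f"quadratic b1, b2:   {b1:.4f}, {b2:.4f}")
print(f"turning point (log GDP): {turning_point:.2f}")
```

With data shaped like this, the linear-only slope is close to zero even though inequality clearly depends on GDP, which is why reading the sign of a single GDP coefficient can be misleading. The quadratic fit recovers the rise and fall, and the sign pattern b1 > 0, b2 < 0 is what an inverted U looks like in the coefficients.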