Python Forum
How to run a linear model by group in Python?




How to run a linear model by group in Python? - Betty775522 - Oct-18-2023

Dear all,
I'm just starting out with Python, which I need to use to fit a linear model to the dataset below:
Output:
Location   Y  X1  X2
       1  32   1   1
       1  44   1   2
       1  58   1   3
       1  76   2   1
       1  73   2   2
       1  37   2   3
       1  52   3   1
       1  78   3   2
       1  60   3   3
       2  93   1   1
       2  78   1   2
       2  25   1   3
       2  97   2   1
       2  85   2   2
       2  60   2   3
       2  70   3   1
       2  62   3   2
       2  95   3   3
My goal is to fit the following linear model:
Y ~ X1 + X2
where X1 and X2 are categorical variables.
The following code gives me exactly what I need:
import pandas as pd
from statsmodels.formula.api import ols

# `dataset` holds the table shown above
df = pd.DataFrame(dataset)

# Fit the model, treating X1 and X2 as categorical factors
reg = ols('Y ~ C(X1) + C(X2)', data=df).fit()

# Fitted value for every row
df['fitted_values'] = reg.fittedvalues

# outlier_test() returns a table; its student_resid column holds the studentized residuals
result = reg.outlier_test()
df['student_resid'] = result.student_resid
What I can't work out is how to run this code separately for each 'Location' and fill the 'fitted_values' and 'student_resid' columns accordingly.
Any help is highly appreciated.
Thanks a lot in advance.
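
One possible way to approach the per-Location fit, offered as a minimal sketch rather than a definitive answer: split the frame with pandas `groupby`, fit the same formula on each piece, and concatenate the pieces again. The data is retyped from the table in the post; the helper name `fit_one_location` and the variable names `parts` and `final` are made up for illustration.

```python
import pandas as pd
from statsmodels.formula.api import ols

# Data retyped from the post: 2 locations x 9 rows each.
df = pd.DataFrame({
    "Location": [1] * 9 + [2] * 9,
    "Y": [32, 44, 58, 76, 73, 37, 52, 78, 60,
          93, 78, 25, 97, 85, 60, 70, 62, 95],
    "X1": [1, 1, 1, 2, 2, 2, 3, 3, 3] * 2,
    "X2": [1, 2, 3] * 6,
})


def fit_one_location(group):
    """Hypothetical helper: fit Y ~ C(X1) + C(X2) on one location's rows."""
    reg = ols("Y ~ C(X1) + C(X2)", data=group).fit()
    out = group.copy()
    # Both results keep the row index of `group`, so assignment aligns by row.
    out["fitted_values"] = reg.fittedvalues
    out["student_resid"] = reg.outlier_test()["student_resid"]
    return out


# Fit each location separately, then stitch the pieces back together
# in the original row order.
parts = [fit_one_location(g) for _, g in df.groupby("Location")]
final = pd.concat(parts).sort_index()
print(final)
```

The explicit loop with `pd.concat` is used here only to keep each step visible. Note that each location has just 9 rows for 5 estimated coefficients, so the per-location fits have few residual degrees of freedom.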