Python Forum
Excel iterations and restrictions - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Data Science (https://python-forum.io/forum-44.html)
+--- Thread: Excel iterations and restrictions (/thread-41819.html)



Excel iterations and restrictions - marialage98 - Mar-23-2024

Hello, i have an excel file that has columns with circular references. And i want to use python to apply restrictons to those columns.
I have a column 'Maquina', a column date and a Column with the values i want to change.
I also have the max for each machine and the max overall ('max refinaria')
The first thing i want to do is see if the sum of the values, grouped by machine and by date is greater than the max for the machine. If it is, i want the difference to be pasted to the next day.
I want this restrinction to be applied to the excel file and to allow the excel iterations to change the other column values that are dependent on this ones.
I have this code, but its not working:
im
port pandas as pd
import xlrd
import openpyxl

data = pd.read_excel('Supermercado.xlsx', sheet_name=4, engine='openpyxl')

# data2 = pd.read_excel('Supermercado.xlsx', sheet_name=6)



maquinas = []


print(type(data.iloc[0][2]))

lines = (data.shape[0])


for i in range (lines):
    if data.iloc[i][0] not in maquinas:
        maquinas.append(data.loc[i][0])

grouped = data.groupby([data.columns[0], data.columns[2]])

summed_values = grouped[data.columns[18]].sum()

# Retrieve the value in the 22nd column for each group
# Since you want to retrieve a single value for each group, you might consider taking the first value
# assuming that the 22nd column contains the same value for each machine on a given day
max_maquina = grouped[data.columns[23]].first()

max_refinaria = 500000
# Combine the summed values and the 22nd column values into a single DataFrame
result_df = pd.DataFrame({
    'Machine': summed_values.index.get_level_values(0),
    'Date': summed_values.index.get_level_values(1),
    'Sum': summed_values.values,
    'Max Maquina': max_maquina.values,
    'Max Refinaria': max_refinaria
})

# Calculate the difference between 'Max Maquina' and 'Sum'
result_df['Difference'] = result_df['Max Maquina'] - result_df['Sum']

# Iterate through the DataFrame using the iterrows() function
for index, row in result_df.iterrows():
    if row['Difference'] < 0:
        # Find the index for the next day for the same machine
        next_day_index = result_df[(result_df['Machine'] == row['Machine']) & (result_df['Date'] > row['Date'])].index.min()
        
        if pd.notnull(next_day_index):
            # Add the absolute value of the negative difference to the next day's 'Sum'
            result_df.at[next_day_index, 'Sum'] += abs(row['Difference'])
            
            # Recalculate the difference for the next day
            result_df.at[next_day_index, 'Difference'] = result_df.at[next_day_index, 'Max Maquina'] - result_df.at[next_day_index, 'Sum']
            
            # Update the original Excel sheet's value
            # You need to find the corresponding row in the original Excel sheet
            # This requires matching the 'Machine' and 'Date' with the original DataFrame 'data'
            # Assuming the 'Sum' is in the 18th column (index 17), you will update this cell
            
            # Find the row in the original DataFrame
            original_row_index = data[(data[data.columns[0]] == row['Machine']) & (data[data.columns[2]] == row['Date'])].index.min()
            
            if pd.notnull(original_row_index):
                # Update the 'Sum' in the original DataFrame
                data.at[original_row_index, data.columns[18]] += abs(row['Difference'])

# After processing, save the updated DataFrame back to the Excel file
# We're using 'openpyxl' since it allows us to edit an existing workbook
#from openpyxl import load_workbook
# Carregar o livro existente
#book = openpyxl.load_workbook('Supermercado.xlsx')

# Salvar o livro
with pd.ExcelWriter("Supermercado.xlsx", engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:            
            data.to_excel(writer, sheet_name="Procura", index=False, header=False)        
            #writer.save()

# If you want to see the result
print(result_df)
Can anyone help me?

Besides i also want to at the same time, make sure that the sum of the colum values for each day is less or equal to the value 'max refinaria' = 500000, and the same process of iteration to be applied.