Pandas Dataframe Filtering based on rows - mvdlm - Apr-02-2022

New to Python and this forum. I have a dataset like this:

    df1 = pd.DataFrame({'opA': [1,1,1,1,0],
                        'opB': [1,1,1,0,1],
                        'opC': [1,1,1,1,2],
                        'opD': [0,1,1,0,3],
                        'Active': ['opC','opD', 'opD', 'opC', 0]})
    df1 = df1.rename(index={df1.last_valid_index() : 'Priority'})
    df1.loc['Priority','Active'] = ''
    print(df1)

The Active column holds the name of the op column that has the max value in each row, taking into account the 'Priority' of each op column. For this, I have this code working:

    df = df.sort_index(axis=1,key=lambda x:df.loc['Priority',x],ascending=False)
    df['Active'] = df.idxmax(axis=1)

Now I need to do this:

    df1 = pd.DataFrame({'opA': [1,1,1,1,0,0],
                        'opB': [1,1,1,0,1,0],
                        'opC': [1,1,1,1,2,0],
                        'opD': [0,1,1,0,3,3],
                        'Active': ['opC','opC', 'opC', 'opC', 0,0]})
    df1 = df1.rename(index={df1.last_valid_index() - 1 : 'Priority'})
    df1 = df1.rename(index={df1.last_valid_index() : 'minOccurrence'})
    df1.loc['Priority','Active'] = ''
    df1.loc['minOccurrence','Active'] = ''
    print(df1)

Since opD doesn't have 3 consecutive "Actives", it isn't active at index 1 or 2, where previously it was Active based on the 'Priority' row only.

Compare that with the case where opD had a 1 at index 0:

    df1 = pd.DataFrame({'opA': [1,1,1,1,0,0],
                        'opB': [1,1,1,0,1,0],
                        'opC': [1,1,1,1,2,0],
                        'opD': [1,1,1,0,3,3],
                        'Active': ['opD','opD', 'opD', 'opC', 0,0]})
    df1 = df1.rename(index={df1.last_valid_index() - 1 : 'Priority'})
    df1 = df1.rename(index={df1.last_valid_index() : 'minOccurrence'})
    df1.loc['Priority','Active'] = ''
    df1.loc['minOccurrence','Active'] = ''
    print(df1)

How do I do this? The minOccurrence row can have any values, not just 0,0,0,3 (e.g. 0,1,3,2).
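One possible way to approach this, as a sketch only and not a tested answer from the thread: for each op column, label the runs of consecutive non-zero values, keep only the rows that sit inside a run at least as long as that column's minOccurrence, then apply the same priority-ordered idxmax as before. The function name `active_column` and the frames `df_short` and `df_long` are made up for illustration. The sketch assumes that a column counts as active on every row of a long-enough run (as in the second example of the post), and it returns NaN for rows where no column qualifies.

```python
import pandas as pd


def active_column(df, priority_row="Priority", min_row="minOccurrence"):
    """Hypothetical helper: per data row, name the highest-priority column
    whose run of consecutive non-zero values is at least min_row long."""
    cols = [c for c in df.columns if c != "Active"]  # ignore an existing Active column
    data = df.drop(index=[priority_row, min_row])[cols].astype(int)
    priority = df.loc[priority_row, cols].astype(int)
    min_run = df.loc[min_row, cols].astype(int)

    eligible = {}
    for col in cols:
        on = data[col].ne(0).astype(int)
        # A new run starts wherever the value differs from the previous row.
        run_id = on.ne(on.shift()).cumsum()
        run_len = on.groupby(run_id).transform("size")
        eligible[col] = on.eq(1) & run_len.ge(min_run[col])
    eligible = pd.DataFrame(eligible)

    # Zero out values that fail the run-length test, then order the columns
    # by descending priority so idxmax breaks ties in favour of priority.
    ordered = priority.sort_values(ascending=False, kind="stable").index.tolist()
    masked = data.where(eligible, 0)[ordered]
    best = masked.idxmax(axis=1)
    return best.where(masked.max(axis=1) > 0)  # NaN when nothing qualifies


index = [0, 1, 2, 3, "Priority", "minOccurrence"]
base = {"opA": [1, 1, 1, 1, 0, 0],
        "opB": [1, 1, 1, 0, 1, 0],
        "opC": [1, 1, 1, 1, 2, 0]}

# opD is on for only two consecutive rows, fewer than its minOccurrence of 3.
df_short = pd.DataFrame({**base, "opD": [0, 1, 1, 0, 3, 3]}, index=index)
# opD is on for three consecutive rows, which meets its minOccurrence of 3.
df_long = pd.DataFrame({**base, "opD": [1, 1, 1, 0, 3, 3]}, index=index)

print(active_column(df_short))  # opC on rows 0-3, as in the post
print(active_column(df_long))   # opD on rows 0-2, opC on row 3, as in the post
```

The result is indexed by the data rows only, so assigning it with `df1['Active'] = active_column(df1)` would leave the Priority and minOccurrence rows as NaN in that column. If a run should instead become active only from its third row onward, the `run_len` test would need to be replaced with a running count inside each run.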