Noob question - Python Forum (https://python-forum.io), Forum: Data Science, Thread: /thread-40689.html
Noob question - BSDevo - Sep-07-2023

Hi, I'm new to Pandas, Python and all this awesomeness! I'm learning the hard way... There are a few things I'm confused about. I'm trying to sort my data and visualize it with graphs, separated by year.

    df_mc = pd.DataFrame(df.groupby(['PMarket', 'PuYear']).size().reset_index())
    df_mc.columns = ['PMarket', 'PuYear', 'Count']
    fig1 = px.bar(df_mc, x="PMarket", y="Count", color='PMarket', range_y=[0,100])

But it shows me the total of all years combined, and the selector won't work. Is that due to my groupby? The Count column is created by the grouping. My range would start from 2000 with no end date, and I want to do this for each month, but I can't find how to filter on an exact month as well. So in the end it should be: the January tab shows all data from January, filtered by year with the selector; the next tab is February, the same as January but for February, and so on (I have the tabs already).

I would also like to rename the fields inside the graph and be able to choose a year. I have a selector but it does nothing: my graph does not change, but if you look at THIS IMAGE, it separates my bar with a dotted line for each year.

Also, would it be safe to combine week 52 with week 53, and if yes, how? As far as I understand, week 52 is part of week 53, or week 53 is part of week 52 and week 1?

I'm also looking for info on how to rename things inside charts, as px.bar shows the column name.

I'd also like to select a market and show it on a graph with all years. Let's say it shows all available years from the earliest date in my data frame, and I select the area I want to look at to see the difference between each year. Then add the percentage difference from the earliest year possible. For example: I got a game CD on 2000-1-1, the next in the franchise came out in 20010, and I got the CD again but at a different price. I want to use my first CD as 100%, and the second CD's price would show me the difference in percent: was it cheaper or more expensive?

I do all of this inside Streamlit. Thank you.
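On the week 52/53 question: assuming the week numbers are ISO 8601 weeks (which is what strftime('%V') produces), week 53 is not part of week 52. It is a separate week that only exists in some years, and the days around New Year can belong to the last week of the previous ISO year or to week 1 of the next one. A small sketch with the standard library; the helper names and the dates are chosen only for illustration:

```python
from datetime import date


def iso_year_week(d: date) -> tuple[int, int]:
    """Return (ISO year, ISO week number) for a calendar date."""
    iso = d.isocalendar()
    return iso[0], iso[1]


def last_iso_week(year: int) -> int:
    """Number of ISO weeks in a year: December 28 always falls in the last one."""
    return date(year, 12, 28).isocalendar()[1]


print(iso_year_week(date(2020, 12, 31)))  # (2020, 53)
print(iso_year_week(date(2021, 1, 1)))    # (2020, 53): still the last week of ISO year 2020
print(iso_year_week(date(2019, 12, 30)))  # (2020, 1): already week 1 of ISO year 2020
print(last_iso_week(2020), last_iso_week(2021))  # 53 52
```

Because the ISO year can differ from the calendar year, grouping by the (ISO year, ISO week) pair avoids filing 1 January 2021 under "week 53 of 2021". Whether to merge week 53 into week 52 for display is a reporting choice; a merged bucket would cover more than seven days, so counts in it are not directly comparable with other weeks.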
RE: Noob question - snippsat - Sep-07-2023

You should post a sample of a working DataFrame. You do a lot of explaining for the 4 lines of code, but it is hard to make any sense of it without an example. Same problem in your other thread.

RE: Noob question - BSDevo - Sep-07-2023

(Sep-07-2023, 09:55 AM)snippsat Wrote: You should post a sample of working DataFrame,

Understood. As I'm a beginner and barely use forums, I thought this would be enough. My code:

Reading data from the uploaded file:

    @st.cache_data
    def load_data(path: str):
        df = pd.read_csv(path, converters={'PZip': str, 'PWeek': str})  # IMPORTANT !!! - to read DATA !!!!
        df = df.drop_duplicates()
        return df

    df = load_data(uploaded_file)

Date column separation to Day, Week, Month, Year:

    df['PDate'] = pd.to_datetime(df['PDate'])  # skip if your Date column is already in datetime format
    df.insert(7, "PDay", "PDate")
    df['PDay'] = df['PDate'].dt.day_name()
    df.insert(7, "PMonth", "PDate")
    df['PMonth'] = df['PDate'].dt.month_name()
    df.insert(7, "PYear", "PDate")
    df['PYear'] = df['PDate'].dt.year
    df['PWeek'] = df['PDate'].dt.strftime('%V')  # Important to get the week number, but at the moment it gives 53 weeks instead of 52

Section with chart with Year selection:

    with features:
        df.column = ['Shop', 'PMarket', 'Receiver', 'Buyer', 'PDay', 'PYear', 'PMonth', 'PState', 'PWeek', 'Status']  # Buyer - me, mom, sister... etc.; Receiver - who received the acquired purchase
        st.write(df)
        January, February, March, April, May, June, July, August, September, October, November, December = st.tabs(["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"])
        with January:
            market_options = df['PMarket'].unique().tolist()
            market_date_options = df['PDate'].unique().tolist()
            market_date = st.selectbox('Choose Year', market_date_options, 3)  # only 1,2,3 works - don't know what it is, left for later inspection
            market_list = st.multiselect('Choose market area', market_options, ['Atlanta'])
            df = df[df['PMarket'].isin(market_list)]
            df_mc = pd.DataFrame(df.groupby(['PMarket', 'PDate']).size().reset_index())
            df_mc.columns = ['PMarket', 'PDate', 'Count']
            fig1 = px.bar(df_mc, x="PMarket", y="Count", color='PMarket', range_y=[0,30], text_auto=True)
            fig1.update_layout(width=1000)
            st.write(fig1)

I know I'm missing the month tabs; I just don't know where to put them so that it counts by month. My main goal is to choose a year and see the results from the selected year, by month, in each tab. At the moment it counts all years and the year selector does not do anything. I posted only the January tab, as the other tabs are empty (no code); I think the rest of the months should use the same code, just changed for the specific month? I hope this helps to understand my code and helps me solve it. Thank you.

RE: Noob question - BSDevo - Sep-08-2023

I sorted out how to choose by year, as I had missed one option.

    df = df[df['PMarket'].isin(market_list)]
    df = df[df['PYear']==market_date]

df = df[df['PYear']==market_date] <--- this one allows me to choose the year from the select list.

    df_mc.columns = ['PMarket', 'PDate', 'Count']

df_mc.columns = ['PMarket', 'Count'] <--- to show on my chart.

But now I realised all of this could be done using the date picker in Streamlit, as I'm having a hard time converting month, day and week to datetime.
    with features:
        market_options = df['PMarket'].unique().tolist()
        min_date = pd.to_datetime(df['PuDate'], errors='coerce')  # PDate renamed to PuDate [ purchase date ]
        max_date = pd.to_datetime(df['PuDate'], errors='coerce')
        value=(min(df['PuDate']), max(df['PuDate'])),
        market_date = st.date_input(
            "Date picker",
            min_value=min(df['PuDate']),
            max_value=max(df['PuDate']),
            value=(min(df['PuDate']), max(df['PuDate'])),
            format="YYYY/MM/DD"
        )
        market_list = st.multiselect('Choose market area', market_options, ['Atlanta'])
        df = df[df['PMarket'].isin(market_list)]
        df = df[df['PuDate']==market_date]
        df_mc = df.groupby(df['PMarket'])['PuDate'].count().reset_index()
        df_mc.columns = ['PMarket', 'Count']
        fig1 = px.bar(df_mc, x="PMarket", y="Count", color='PMarket', range_y=[0,30], text_auto=True)
        fig1.update_layout(width=1000)
        st.write(fig1)

But I'm getting an error. I'm trying to apply my previous code.
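On the difficulty of converting month, day and week back to datetime: one option is to not convert back at all and to filter on the datetime column directly through the .dt accessor. A minimal sketch, assuming a PuDate column and a PMarket column as in the post; the helper name count_by_market and the sample rows are made up for illustration:

```python
import pandas as pd


def count_by_market(df: pd.DataFrame, year: int, month: int) -> pd.DataFrame:
    """Count rows per market for one calendar year and month (1-12)."""
    # Parse once; values that cannot be parsed become NaT and never match the mask.
    dates = pd.to_datetime(df["PuDate"], errors="coerce")
    mask = (dates.dt.year == year) & (dates.dt.month == month)
    return (
        df.loc[mask]
        .groupby("PMarket")
        .size()
        .reset_index(name="Count")
    )


# Made-up sample data, only to show the shape of the result.
sample = pd.DataFrame({
    "PMarket": ["Atlanta", "Atlanta", "Boston", "Atlanta"],
    "PuDate": ["2022-01-05", "2022-01-20", "2022-01-07", "2023-01-09"],
})
january_2022 = count_by_market(sample, year=2022, month=1)
print(january_2022)
```

Each month tab could call the same helper with a different month number, with the year taken from the selectbox. For the names shown in the chart, px.bar accepts a labels dictionary such as labels={'PMarket': 'Market'}, which is one way to rename what the chart displays without renaming the DataFrame columns.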
RE: Noob question - deanhystad - Sep-08-2023

From the documentation: https://docs.streamlit.io/library/api-reference/widgets/st.date_input

value (datetime.date or datetime.datetime or list/tuple of datetime.date or datetime.datetime or None)
The value of this widget when it first renders. If a list/tuple with 0 to 2 date/datetime values is provided, the datepicker will allow users to provide a range. Defaults to today as a single-date picker.

st.date_input returns (datetime.date or a tuple with 0-2 dates)

I think the format of the return value is set by the format of the value argument. You provided a tuple as the value argument, so you should expect a tuple as the return type. I think the return type might also be an empty tuple, so you need to account for that possibility.
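Following that reasoning, the comparison df['PuDate']==market_date in the earlier code would be comparing a column against a tuple rather than a single date. One way to cope with the possible return shapes is to normalise them first. This is only a sketch, not part of Streamlit: the helper name normalize_date_selection is hypothetical, and it assumes the value is either a single date or a list/tuple of 0 to 2 dates, as in the documentation quoted in this reply.

```python
from datetime import date, datetime
from typing import Optional, Tuple


def normalize_date_selection(selection) -> Optional[Tuple[date, date]]:
    """Turn a date-picker result into (start, end), or None if nothing is selected.

    Accepts a single date/datetime, or a list/tuple holding 0, 1 or 2 of them.
    """
    if selection is None:
        return None
    if isinstance(selection, (date, datetime)):
        items = [selection]
    else:
        items = list(selection)
    if not items:
        return None  # empty tuple: nothing has been picked yet
    # datetime is a subclass of date; reduce everything to plain dates.
    days = [d.date() if isinstance(d, datetime) else d for d in items]
    # A single date (or a half-finished range) gives start == end.
    return min(days), max(days)


print(normalize_date_selection((date(2023, 1, 1), date(2023, 3, 31))))
print(normalize_date_selection((date(2023, 1, 1),)))
print(normalize_date_selection(()))
```

With a (start, end) pair in hand, the filter could be written as a range check instead of an equality, for example df[df['PuDate'].between(pd.Timestamp(start), pd.Timestamp(end))], assuming PuDate has already been converted with pd.to_datetime. When the helper returns None, the chart can simply be skipped for that rerun.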