Python Forum
UserWarning: Could not infer format

UserWarning: Could not infer format - NewBiee - Sep-13-2023

Hi,

I have this code:
import pandas as pd


def get_rework_flags(logs, clean_df):
    """ Gets the flags used for rework """
    logger = logs['logger']
    # logger.info('Rework: Setting index on clean df')
    # clean_df = clean_df.set_index('Filename')
    logger.info('Rework: Extracting sent items')
    rework_df = clean_df.set_index('Filename')['Clean'].astype(
        str,
    ).str.lower().str.extractall('sent:(.*)').reset_index()
    rework_df.columns = ['Filename', 'date_num', 'date_sent']
    logger.info('Rework: Converting to dates')
    rework_df['date_sent'] = pd.to_datetime(
        rework_df['date_sent'], errors='coerce', exact=False,
    )
And I'm getting this warning:
Output:
UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  rework_df['date_sent'] = pd.to_datetime(
Please help.


RE: UserWarning: Could not infer format - Larz60+ - Sep-13-2023

Please show the complete, unaltered error traceback; it includes a lot of useful background information about the app.


RE: UserWarning: Could not infer format - deanhystad - Sep-13-2023

And post an example of what is found in rework_df['date_sent'].
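
A quick way to produce such a sample is to print the distinct values with repr(), which makes leading spaces and empty strings visible. This is only a sketch: the Series below is a hypothetical stand-in for rework_df['date_sent'], since the real values were not posted.

```python
import pandas as pd

# Hypothetical stand-in for rework_df['date_sent']; the real values were not posted.
date_sent = pd.Series([" june 1, 2023", " 2023:6:1", "", None], dtype=object)

# repr() shows leading spaces and empty strings that a plain print would hide.
sample = [repr(v) for v in date_sent.drop_duplicates().head(20)]
print(sample)

# Count the missing entries separately.
print(date_sent.isna().sum(), "missing value(s)")
```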

Do you need the .astype(str)? If you are searching for "sent:" can the column be anything other than a str?

I would print the dataframe after this step to see how many columns there are and whether one of them looks like a datetime string.
rework_df = (
    clean_df.set_index("Filename")["Clean"]
    .str.lower()
    .str.extractall("sent:(.*)")
    .reset_index()
)
print(rework_df)
The code works if it finds a matching string and the date_sent column holds some kind of date string.
import pandas as pd

clean_df = pd.DataFrame(
    {
        "Filename": ["A", "B", "C"],
        "Clean": [f"sent: June {x}, 2023" for x in range(1, 4)],
    }
)

rework_df = (
    clean_df.set_index("Filename")["Clean"]
    .str.lower()
    .str.extractall("sent:(.*)")
    .reset_index()
)
print(rework_df)

rework_df.columns = ["Filename", "date_num", "date_sent"]
rework_df["date_sent"] = pd.to_datetime(
    rework_df["date_sent"],
    errors="coerce",
    exact=False,
)
print(rework_df)
Output:
  Filename  date_num  date_sent
0        A         0 2023-06-01
1        B         0 2023-06-02
2        C         0 2023-06-03
But it doesn't work for a datetime string like 2023:6:1, or a blank, or None. For those I get the same warning you are getting.
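
One way to act on the warning's own suggestion is to pass an explicit format to pd.to_datetime. The following is a minimal sketch, assuming the dates look like "June 1, 2023" as in the example above. The sample rows and the format string "%B %d, %Y" are assumptions, because the real data was not posted. The captured text is stripped first so the leading space after "sent:" does not break the match, and anything that does not fit the format becomes NaT through errors="coerce".

```python
import pandas as pd

# Hypothetical sample data: one well-formed date, one oddly formatted
# date, and one row with nothing after "sent:".
clean_df = pd.DataFrame(
    {
        "Filename": ["A", "B", "C"],
        "Clean": ["sent: June 1, 2023", "sent: 2023:6:1", "sent:"],
    }
)

rework_df = (
    clean_df.set_index("Filename")["Clean"]
    .str.lower()
    .str.extractall(r"sent:(.*)")
    .reset_index()
)
rework_df.columns = ["Filename", "date_num", "date_sent"]

# An explicit format avoids per-element format guessing.
# Values that do not match the format are coerced to NaT.
rework_df["date_sent"] = pd.to_datetime(
    rework_df["date_sent"].str.strip(),
    format="%B %d, %Y",
    errors="coerce",
)
print(rework_df)
```

If the real column mixes several date layouts, a single format will turn the non-matching rows into NaT, so it is worth checking how many rows end up missing before relying on the result.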


RE: UserWarning: Could not infer format - Neventold - May-10-2024

Thanks for the answer.