May-01-2023, 06:43 PM
Hi There,
I know this may seem like a silly question, since there are probably thousands of tutorials, but I just can't quite figure it out.
I have a DataFrame with a column (columnA) whose dtype is object, but each value is just a string of text like this:
'columnA': ['680 PACKAGES FOR SOAP NET WEIGHT: 17, 000. 00 KGS']
I want to extract NET WEIGHT: 17, 000. 00 KGS
This is what I've tried thus far:
df['Net Weight'] = df['columnA'].str.extract('NET WEIGHT: (\d+ KGS)')
df['Net Weight'] = df['columnA'].str.extract('NET WEIGHT:? (\d+ KGS)')
df['Net Weight'] = df['columnA'].str.extract('NET\sWEIGHT:\s?(\d+\.?\d*\sKGS)')
df['Net Weight'] = df['columnA'].apply(lambda x: re.search(r'NET\sWEIGHT:\s([\d,]+\.\d+\sKGS)', x).group(1) if re.search(r'NET\sWEIGHT:\s([\d,]+\.\d+\sKGS)', x) else None)
Nothing works; the new column still shows NaN values. I've been reading the documentation for re and pandas.Series.str, and I can't find anything that suits my needs.
Also! Sometimes Net Weight comes in other forms, like N.W.: or Net Weight; or Net Weight or Net WT:, and the endings vary, e.g. KGS, KG.
I'm really not sure how to explore this further.
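For what it's worth, the attempts above most likely return NaN because every pattern expects the number to be written without internal spaces, while the sample value is '17, 000. 00' (a space follows both the comma and the dot). Below is a minimal sketch of one possible approach, not a definitive solution. The sample rows other than the first, the name PATTERN, and the 'Net Weight (kg)' column are made up for illustration, and the numeric conversion assumes the comma is a thousands separator and the dot is the decimal point.

```python
import re

import pandas as pd

# Hypothetical sample data: only the first row comes from the post.
df = pd.DataFrame({
    "columnA": [
        "680 PACKAGES FOR SOAP NET WEIGHT: 17, 000. 00 KGS",
        "12 CARTONS N.W.: 250 KG",
        "5 PALLETS Net WT: 1,200.50 KGS GROSS",
        "NO WEIGHT HERE",
    ]
})

# Verbose regex: whitespace and comments inside the pattern are ignored.
PATTERN = r"""
    (?P<full>                                  # whole label + number + unit span
      \b(?:NET\s*(?:WEIGHT|WT)|N\.\s*W\.?)     # NET WEIGHT, NET WT, N.W.
      \s*[:;]?\s*                              # optional colon or semicolon
      (?P<value>\d[\d,.\s]*?)                  # digits with stray commas, dots, spaces
      \s*(?P<unit>KGS?)\b                      # KG or KGS
    )
"""

# With several named groups, str.extract returns one column per group.
parts = df["columnA"].str.extract(PATTERN, flags=re.IGNORECASE | re.VERBOSE)

# The matched text as it appears in the source string (NaN when nothing matches).
df["Net Weight"] = parts["full"]

# Assumption: commas are thousands separators, so commas and spaces are dropped.
df["Net Weight (kg)"] = pd.to_numeric(
    parts["value"].str.replace(r"[,\s]", "", regex=True),
    errors="coerce",
)

print(df[["Net Weight", "Net Weight (kg)"]])
```

Rows with no recognizable label stay NaN in both columns. If other label or unit spellings turn up, they would need to be added to the alternation by hand.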