Trying to parse only 3 key values from json file
Python Forum (https://python-forum.io), Forum: General Coding Help, Thread: /thread-37297.html
Trying to parse only 3 key values from json file - cubangt - May-24-2022

So I'm playing around with parsing a JSON file in Python. I'm able to read in the file and print it to the console, but now I want to extract 3 values from each "section" (not sure what the proper terminology is). Here is an example of the data structure:

    "messages": [
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260883178,
            "content": "There are plenty of leftovers",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "There are plenty of leftovers",
                "is_bumped": false
            }
        },
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260872966,
            "content": "Watching the new scream movie",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "Watching the new scream movie",
                "is_bumped": false
            }
        },

I basically need to pull out only the first 3 key/value pairs of each message and save them into a CSV file:

    "sender_name": "Me",
    "timestamp_ms": 1653260883178,
    "content": "There are plenty of leftovers",

Right now I have this basic code, but I need to figure out how to get inside the "messages" section and pull out those 3 values per group:

    import json

    f = open('message_1.json')
    data = json.load(f)
    for i in data['messages']:
        print(i)
    f.close()

RE: Trying to parse only 3 key values from json file - menator01 - May-24-2022

Have a look at dicts

RE: Trying to parse only 3 key values from json file - deanhystad - May-25-2022

    import json
    from datetime import datetime

    json_str = """
    {
        "messages": [
            {
                "sender_name": "Me",
                "timestamp_ms": 1653260883178,
                "content": "There are plenty of leftovers",
                "type": "Generic",
                "is_unsent": false,
                "is_taken_down": false,
                "bumped_message_metadata": {
                    "bumped_message": "There are plenty of leftovers",
                    "is_bumped": false
                }
            },
            {
                "sender_name": "Me",
                "timestamp_ms": 1653260872966,
                "content": "Watching the new scream movie",
                "type": "Generic",
                "is_unsent": false,
                "is_taken_down": false,
                "bumped_message_metadata": {
                    "bumped_message": "Watching the new scream movie",
                    "is_bumped": false
                }
            }
        ]
    }
    """

    data = json.loads(json_str)
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        print(
            f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
        )

RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

So here is what I have, and it seems to work. Now I'm trying to save this to a CSV so I can test importing it into my Excel report:

    import json
    from datetime import datetime

    f = open('messages.json')
    data = json.load(f)
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        if 'content' not in message:
            print(
                f"""{timestamp} from {message["sender_name"]}\n"""
            )
        else:
            print(
                f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
            )
    f.close()

RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

What am I doing wrong?

    import json
    from datetime import datetime
    import pandas as pd

    f = open('message_1.json')
    data = json.load(f)
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        if 'content' not in message:
            rw = pd.DataFrame([timestamp,message["sender_name"],pd.NA], columns=['Date', 'Name', 'Comment'])
        else:
            rw = pd.DataFrame([timestamp,message["sender_name"], message["content"]], columns=['Date', 'Name', 'Comment'])
        rw.to_csv('igMess.csv',columns=["Date", "Name", "Comment"], header=None, index=None, mode='a')
    f.close()

I get this error:
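
The error text did not survive in this copy of the thread, so what follows is a guess at the cause rather than a diagnosis. pandas treats a flat list such as `[timestamp, name, comment]` as three rows in a single column, which does not fit three column names; a row has to be wrapped in an outer list (`[[timestamp, name, comment]]`) to count as one row of three columns. The sketch below sidesteps the per-message DataFrame: it collects one inner list per message and writes everything once with the standard-library `csv` module, which is a swapped-in technique and not what the posts use. The sample messages are trimmed from the first post; the UTC time zone and the in-memory buffer are assumptions made to keep the example self-contained.

```python
import csv
import io
from datetime import datetime, timezone

# Trimmed sample data; the second message deliberately has no "content" key.
messages = [
    {"sender_name": "Me", "timestamp_ms": 1653260883178,
     "content": "There are plenty of leftovers"},
    {"sender_name": "Me", "timestamp_ms": 1653260872966},
]

rows = []
for message in messages:
    # Assumption: UTC, so the result does not depend on the local time zone.
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000, tz=timezone.utc)
    # One inner list per message means one row with three columns.
    rows.append([
        timestamp.isoformat(),
        message["sender_name"],
        message.get("content", ""),  # empty string when "content" is missing
    ])

# The buffer stands in for open("igMess.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Date", "Name", "Comment"])
writer.writerows(rows)
print(buffer.getvalue())
```

Writing once after the loop also avoids reopening the file in append mode for every message.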

RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

OK, I got past the error and a file was generated, BUT I'm not sure how to split out the timestamp so that I have a date and a time separated in the CSV:

    import json
    from datetime import datetime
    import pandas as pd

    f = open('message_1.json')
    data = json.load(f)
    lv = []
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        date_val = timestamp.strftime('%Y-%m-%d')
        if 'content' not in message:
            st = date_val +","+message["sender_name"]+","+""
            lv.append(st)
        else:
            st = date_val +","+message["sender_name"]+","+message["content"]
            lv.append(st)
    df = pd.DataFrame(lv)
    df.to_csv('igMess.csv', header=None, index=None, mode='a')
    f.close()

The file generated by the code above contained this output:

    "2022-05-22,Me,There are plenty of leftovers"
    "2022-05-22,Me,Watching the new scream movie"

The expected results should look like this:

    5/17/22, 5:28 PM,Me: There are plenty of leftovers
    5/17/22, 5:28 PM,Me: Watching the new scream movie

If you notice, the generated results have "" around each row and are missing the 5:28 PM time.
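
A likely reason for the quotes: each row is built as one string that itself contains commas, so the CSV writer sees a single field and quotes it. Passing the values as separate fields avoids that. The sketch below shows both cases side by side and also splits the timestamp into a date part and a time part without zero padding. It uses the standard-library `csv` module instead of pandas; `split_timestamp` and `to_csv_text` are hypothetical helper names, and UTC is assumed, so the times differ from what the poster's local clock would show.

```python
import csv
import io
from datetime import datetime, timezone

def split_timestamp(timestamp_ms, tz=timezone.utc):
    """Return (date, time) strings without zero padding on month, day, or hour."""
    ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=tz)
    date_val = f"{ts.month}/{ts.day}/{ts.year % 100:02d}"
    hour12 = ts.hour % 12 or 12  # 0 and 12 both display as 12
    time_val = f"{hour12}:{ts.minute:02d} {'AM' if ts.hour < 12 else 'PM'}"
    return date_val, time_val

def to_csv_text(rows):
    """Write rows to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

date_val, time_val = split_timestamp(1653260883178)
content = "There are plenty of leftovers"

# One pre-joined string: the writer sees a single field containing commas and quotes it.
one_field = to_csv_text([[f"{date_val},{time_val},Me,{content}"]])

# Four separate values: the writer emits four unquoted fields.
four_fields = to_csv_text([[date_val, time_val, "Me", content]])

print(one_field)
print(four_fields)
```

The same idea carries over to pandas: a list of four-element lists gives a four-column DataFrame, and `to_csv` then has no reason to quote the whole row.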

RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

OK, I got the time added and working, so now the only question is how to remove the " " around each row in the file. Here is the currently working code:

    import json
    from datetime import datetime
    import pandas as pd

    f = open('message_1.json')
    data = json.load(f)
    lv = []
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        date_val = timestamp.strftime('%Y-%m-%d')
        time_val = timestamp.strftime("%I:%M %p")
        if 'content' not in message:
            st = date_val + "," + time_val + "," + message["sender_name"] + "," + ""
            lv.append(st)
        else:
            st = date_val +"," + time_val + "," + message["sender_name"] + "," + message["content"]
            lv.append(st)
    df = pd.DataFrame(lv)
    df.to_csv('igMess.csv', header=None, index=None, mode='a')
    f.close()

RE: Trying to parse only 3 key values from json file - cubangt - Jul-15-2022

I have been running this a few times since the above post and found a few things that I'm hoping I can fix in the above code. I noticed that if a message is very large, it gets split up in my CSV file; I only want my CSV to have 4 columns. Here is the current code that does work; it just needs some adjustments to make sure my "content" column is all-inclusive and not split out.
When I ran this code today against the newest JSON file, I found data in 4 or 6 other columns; basically, certain rows had data spread across columns A through M.

    import json
    from datetime import datetime
    import pandas as pd
    import os
    import csv

    f = open('message_1.json')
    data = json.load(f)
    lv = []
    for message in data["messages"]:
        timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
        lv.append([
            timestamp.strftime("%m/%d/%Y"),
            timestamp.strftime("%I:%M:%S %p"),
            message["sender_name"],
            message["content"] if "content" in message else "Media Link"])
    df = pd.DataFrame(lv, columns=["Date", "Time", "Sender", "Content"])
    df.to_csv('igMess.csv', header=None, index=None, quoting=csv.QUOTE_NONE, escapechar=",", mode='a')
    f.close()

RE: Trying to parse only 3 key values from json file - deanhystad - Jul-16-2022

What are all the possible keys that contain content values? How should the content values be combined?
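
One possible explanation for the spill into extra columns, offered as an assumption since the thread does not settle it: with `quoting=csv.QUOTE_NONE` nothing is ever quoted, so any comma or line break inside a message still reaches the file as a raw separator, and using `","` as the escape character adds further commas instead of protecting them. Leaving quoting at its default lets the writer wrap only the fields that need it. The sketch below uses the standard-library `csv` module with invented sample rows (the dates and message text are illustrative, not from the poster's file) and reads the output back to check that each row still has four columns.

```python
import csv
import io

# Invented sample rows: the first message contains commas and a line break.
rows = [
    ["07/15/2022", "05:28:00 PM", "Me", "Eggs, milk, and bread\nplus coffee"],
    ["07/15/2022", "05:29:10 PM", "Me", "Media Link"],
]

# The buffer stands in for open("igMess.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.writer(buffer)  # default quoting is csv.QUOTE_MINIMAL
writer.writerows(rows)

# Read the text back to confirm the column count survives the round trip.
buffer.seek(0)
read_back = list(csv.reader(buffer))

for row in read_back:
    print(len(row), row)
```

With pandas, dropping the `quoting` and `escapechar` arguments from `to_csv` should give the same default quoting, and Excel generally reads a quoted field as a single cell.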