Python Forum
Trying to parse only 3 key values from json file




Trying to parse only 3 key values from json file - cubangt - May-24-2022

So I'm playing around with parsing a JSON file in Python.
I'm able to read in the file and print it to the console, but now I want to extract 3 values from each "section" (not sure what the proper terminology is).

Here is an example of the data structure:
  "messages": [
    {
      "sender_name": "Me",
      "timestamp_ms": 1653260883178,
      "content": "There are plenty of leftovers",
      "type": "Generic",
      "is_unsent": false,
      "is_taken_down": false,
      "bumped_message_metadata": {
        "bumped_message": "There are plenty of leftovers",
        "is_bumped": false
      }
    },
    {
      "sender_name": "Me",
      "timestamp_ms": 1653260872966,
      "content": "Watching the new scream movie",
      "type": "Generic",
      "is_unsent": false,
      "is_taken_down": false,
      "bumped_message_metadata": {
        "bumped_message": "Watching the new scream movie",
        "is_bumped": false
      }
    },
I basically need to pull out only the first 3 key/value pairs from each message and save them to a CSV file:

      "sender_name": "Me",
      "timestamp_ms": 1653260883178,
      "content": "There are plenty of leftovers",
Right now I have this basic code, but I need to figure out how to get inside the "messages" section and pull out those 3 values per group:

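import json

# Load the whole export into a dict; "with" closes the file automatically.
with open('message_1.json') as f:
    data = json.load(f)

# data['messages'] is a list with one dict per message.
for message in data['messages']:
    print(message)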



RE: Trying to parse only 3 key values from json file - menator01 - May-24-2022

Have a look at dicts
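
A minimal sketch of what that means here: each entry in data["messages"] is an ordinary dict, so the three values can be read by key. The sample message is copied from the first post. Using .get() for "content" is an assumption, in case some messages turn out to have no text.

```python
# One message from the sample data, as json.load() would return it.
message = {
    "sender_name": "Me",
    "timestamp_ms": 1653260883178,
    "content": "There are plenty of leftovers",
    "type": "Generic",
}

# Index by key to pull out just the values you care about.
sender = message["sender_name"]
timestamp_ms = message["timestamp_ms"]
# .get() returns a default instead of raising KeyError when the key is missing.
content = message.get("content", "")

print(sender, timestamp_ms, content)
```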


RE: Trying to parse only 3 key values from json file - deanhystad - May-25-2022

import json
from datetime import datetime

json_str = """
{
    "messages": [
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260883178,
            "content": "There are plenty of leftovers",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "There are plenty of leftovers",
                "is_bumped": false
            }
        },
        {
            "sender_name": "Me",
            "timestamp_ms": 1653260872966,
            "content": "Watching the new scream movie",
            "type": "Generic",
            "is_unsent": false,
            "is_taken_down": false,
            "bumped_message_metadata": {
                "bumped_message": "Watching the new scream movie",
                "is_bumped": false
            }
        }
    ]
}
"""

data = json.loads(json_str)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    print(
        f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
    )
Output:
2022-05-22 18:08:03.178000 from Me
There are plenty of leftovers

2022-05-22 18:07:52.966000 from Me
Watching the new scream movie



RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

So here is what I have, and it seems to work. Now I'm trying to save this to a CSV so I can test importing it into my Excel report.

import json
from datetime import datetime

f = open('messages.json')

data = json.load(f)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    if 'content' not in message:
        print(
            f"""{timestamp} from {message["sender_name"]}\n"""
        )
    else:
        print(
            f"""{timestamp} from {message["sender_name"]}\n{message["content"]}\n"""
        )

f.close()
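
One possible way to go from the prints above to a CSV is the standard library's csv module. This is only a sketch: the helper name message_rows and the header names are made up for illustration, the sample messages stand in for data["messages"], and UTC is used so the result does not depend on the machine's time zone (the code above uses local time).

```python
import csv
import io
from datetime import datetime, timezone


def message_rows(messages):
    """Yield one [timestamp, sender, content] row per message (hypothetical helper)."""
    for message in messages:
        timestamp = datetime.fromtimestamp(
            message["timestamp_ms"] / 1000, tz=timezone.utc
        )
        yield [
            timestamp.strftime("%Y-%m-%d %H:%M:%S"),
            message["sender_name"],
            message.get("content", ""),  # empty cell when there is no text
        ]


# Stand-in for data["messages"]; the second message has no "content" key.
messages = [
    {"sender_name": "Me", "timestamp_ms": 1653260883178,
     "content": "There are plenty of leftovers"},
    {"sender_name": "Me", "timestamp_ms": 1653260872966},
]

# Write to an in-memory buffer so the example runs without touching disk.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Timestamp", "Sender", "Content"])
writer.writerows(message_rows(messages))
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead, the buffer could be replaced with something like open("messages.csv", "w", newline="", encoding="utf-8"), where the file name is again just a placeholder.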



RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

What am I doing wrong?

import json
from datetime import datetime
import pandas as pd


f = open('message_1.json')

data = json.load(f)

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    if 'content' not in message:
        rw = pd.DataFrame([timestamp,message["sender_name"],pd.NA], columns=['Date', 'Name', 'Comment'])

    else:
        rw = pd.DataFrame([timestamp,message["sender_name"], message["content"]], columns=['Date', 'Name', 'Comment'])


rw.to_csv('igMess.csv',columns=["Date", "Name", "Comment"], header=None, index=None, mode='a')

f.close()
I get this error:

Error:
ValueError: Shape of passed values is (3, 1), indices imply (3, 3)
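
For what it's worth, the error seems to come from passing a flat list: pandas reads [a, b, c] as three rows of one column, which cannot line up with three column names. Note also that rw is overwritten on every pass and to_csv runs once after the loop, so at most the last message would be written. A sketch of one way around both, assuming made-up sample messages and UTC timestamps:

```python
from datetime import datetime, timezone

import pandas as pd

columns = ["Date", "Name", "Comment"]

# A flat list is three rows of one column, hence the shape mismatch.
try:
    pd.DataFrame(["2022-05-22", "Me", "hello"], columns=columns)
except ValueError as exc:
    print("flat list fails:", exc)

# Stand-in for data["messages"]; the second message has no "content" key.
messages = [
    {"sender_name": "Me", "timestamp_ms": 1653260883178,
     "content": "There are plenty of leftovers"},
    {"sender_name": "Me", "timestamp_ms": 1653260872966},
]

# Collect one inner list per message, then build a single DataFrame at the end.
rows = []
for message in messages:
    timestamp = datetime.fromtimestamp(
        message["timestamp_ms"] / 1000, tz=timezone.utc
    )
    rows.append([
        timestamp.strftime("%Y-%m-%d"),
        message["sender_name"],
        message.get("content", ""),
    ])

df = pd.DataFrame(rows, columns=columns)
print(df.to_csv(header=False, index=False))
```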



RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

OK, I got past the error and a file was generated, but I'm not sure how to split the timestamp so that the date and the time are separate in the CSV.

import json
from datetime import datetime
import pandas as pd


f = open('message_1.json')

data = json.load(f)

lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    date_val = timestamp.strftime('%Y-%m-%d')
    
    if 'content' not in message:
        st = date_val +","+message["sender_name"]+","+""
        lv.append(st)

    else:
        st = date_val +","+message["sender_name"]+","+message["content"]
        lv.append(st)

df = pd.DataFrame(lv)

df.to_csv('igMess.csv', header=None, index=None, mode='a')

f.close()
The file generated by the code above contains this output:

"2022-05-22,Me,There are plenty of leftovers"
"2022-05-22,Me,Watching the new scream movie"
The expected results should look like this:

5/17/22, 5:28 PM,Me: There are plenty of leftovers
5/17/22, 5:28 PM,Me: Watching the new scream movie
Notice that the generated results have "" around each row and are missing the 5:28 PM time.
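
If the goal is exactly the 5/17/22 and 5:28 PM style (no leading zeros), one portable option is to build the strings from the datetime attributes, since the strftime flags for unpadded values differ between platforms. A small sketch, assuming UTC rather than the local time zone used in the code above:

```python
from datetime import datetime, timezone

# UTC is an assumption; the thread's code uses the machine's local time zone.
timestamp = datetime.fromtimestamp(1653260883178 / 1000, tz=timezone.utc)

# Unpadded month and day with a two-digit year.
date_val = f"{timestamp.month}/{timestamp.day}/{timestamp:%y}"

# 12-hour clock without a leading zero; hours 0 and 12 both display as 12.
hour12 = timestamp.hour % 12 or 12
suffix = "AM" if timestamp.hour < 12 else "PM"
time_val = f"{hour12}:{timestamp:%M} {suffix}"

print(date_val, time_val)  # 5/22/22 11:08 PM
```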


RE: Trying to parse only 3 key values from json file - cubangt - May-25-2022

OK, I got the time added and working, so now the only question is how to remove the " " around each row in the file.

Here is the currently working code:

import json
from datetime import datetime
import pandas as pd

f = open('message_1.json')

data = json.load(f)
lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    date_val = timestamp.strftime('%Y-%m-%d')
    time_val = timestamp.strftime("%I:%M %p")
    
    if 'content' not in message:
        st = date_val + "," + time_val + "," + message["sender_name"] + "," + ""
        lv.append(st)

    else:
        st = date_val +"," + time_val + "," + message["sender_name"] + "," + message["content"]
        lv.append(st)

df = pd.DataFrame(lv)

df.to_csv('igMess.csv', header=None, index=None, mode='a')

f.close()
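
The quotes most likely appear because each row is one pre-joined string: the DataFrame then has a single column, and a field that contains commas has to be quoted to stay one field. Keeping the values separate avoids that. The sketch below shows the effect with the standard csv module; to_csv_text is a made-up helper name, and the sample values mirror the output shown earlier.

```python
import csv
import io


def to_csv_text(rows):
    """Write rows to an in-memory CSV and return the text (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# One pre-joined string per row: a single field containing commas, so it is quoted.
joined = to_csv_text([["2022-05-22,06:08 PM,Me,There are plenty of leftovers"]])

# One value per column: the writer inserts the commas and no quotes are needed.
separate = to_csv_text(
    [["2022-05-22", "06:08 PM", "Me", "There are plenty of leftovers"]]
)

print(joined)
print(separate)
```

With pandas, the equivalent change would be to append a list such as [date_val, time_val, message["sender_name"], content] to lv rather than a joined string, so that the DataFrame gets four real columns.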



RE: Trying to parse only 3 key values from json file - cubangt - Jul-15-2022

I have run this a few times since the above post and found a few things that I'm hoping I can fix in the above code. I noticed that if a message is very large, it gets split up in my CSV file. I only want my CSV to have 4 columns.

Here is the current code that does work; it just needs some adjustments to make sure my "content" column is all-inclusive and not split out. When I ran this code today against the newest JSON file, I found data in 4 or 6 other columns; basically, certain rows had data spread across columns A through M.

import json
from datetime import datetime
import pandas as pd
import os
import csv

f = open('message_1.json')

data = json.load(f)

lv = []

for message in data["messages"]:
    timestamp = datetime.fromtimestamp(message["timestamp_ms"] / 1000)
    
    lv.append([
        timestamp.strftime("%m/%d/%Y"),
        timestamp.strftime("%I:%M:%S %p"),
        message["sender_name"],
        message["content"] if "content" in message else "Media Link"])  

df = pd.DataFrame(lv, columns=["Date", "Time", "Sender", "Content"])

df.to_csv('igMess.csv', header=None, index=None, quoting=csv.QUOTE_NONE,  escapechar=",", mode='a')

f.close()
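
A likely cause of the extra columns is the combination quoting=csv.QUOTE_NONE with escapechar=",": with quoting turned off, every comma inside a message is "escaped" by writing another comma in front of it, and a reader then sees each of those as a new column. The sketch below compares that with the default quoting using the standard csv module; the row is invented sample data and write_rows/read_rows are hypothetical helpers.

```python
import csv
import io

# Invented sample row whose content contains commas.
row = ["07/15/2022", "05:28:00 PM", "Me", "Sure, see you at 5, maybe 6"]


def write_rows(rows, **fmt):
    """Write rows to an in-memory CSV using the given csv.writer options."""
    buffer = io.StringIO()
    csv.writer(buffer, **fmt).writerows(rows)
    return buffer.getvalue()


def read_rows(text):
    """Parse CSV text back into a list of rows with the default settings."""
    return list(csv.reader(io.StringIO(text)))


# Settings from the post: never quote, and prefix embedded commas with a comma.
escaped = write_rows([row], quoting=csv.QUOTE_NONE, escapechar=",")
# Default settings: a field that contains commas is wrapped in quotes instead.
quoted = write_rows([row])

print("QUOTE_NONE columns:", len(read_rows(escaped)[0]))
print("default columns:   ", len(read_rows(quoted)[0]))
```

If that is what is happening, dropping the quoting and escapechar arguments from to_csv should keep each message in one cell, at the cost of quotes around any content that contains commas or line breaks. Spreadsheet programs normally read such quoted fields as a single cell.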



RE: Trying to parse only 3 key values from json file - deanhystad - Jul-16-2022

What are all the possible keys that contain content values? How should the content values be combined?
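
One way to answer the first question is to count which keys actually occur across all messages. This is only a sketch: the sample list stands in for data["messages"], and the "photos" and "share" keys are hypothetical examples of messages that carry something other than text.

```python
from collections import Counter

# Stand-in for data["messages"]; the non-text keys here are hypothetical.
messages = [
    {"sender_name": "Me", "timestamp_ms": 1653260883178,
     "content": "There are plenty of leftovers"},
    {"sender_name": "Me", "timestamp_ms": 1653260872966,
     "photos": [{"uri": "example.jpg"}]},
    {"sender_name": "Me", "timestamp_ms": 1653260860000,
     "share": {"link": "https://example.com"}},
]

# Count how many messages contain each top-level key.
key_counts = Counter(key for message in messages for key in message)

for key, count in key_counts.most_common():
    print(f"{key}: {count}")
```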