first project read from xslx get only 1 column\row? - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: first project read from xslx get only 1 column\row? (/thread-33926.html) |
first project read from xslx get only 1 column\row? - korenron - Jun-10-2021 Hello I have xlsx file with 6 columns with 100 rows ~ . the first thing I want to do is : Read column A see the data , insert him to a list if it's not appear already so I will have a list of all the names in this column which package it will be best to use ? Thanks, RE: first project read from xslx get only 1 column\row? - Larz60+ - Jun-10-2021 openpyxl will do this PyPi page: https://pypi.org/project/openpyxl/ Docs: https://openpyxl.readthedocs.io/en/stable/: install with pip install openpyxl from command line.
RE: first project read from xslx get only 1 column\row? - korenron - Jun-10-2021 OK I have start to work with this now I have anoter question it's seem that it take too much time if I understand correct it load the all file that start to run on it so I took another file with 1000000 lines and get Error: you have 1048574 lines of data MemoryError During handling of the above exception, another exception occurred:is there any wayto fix this ? this is what I have done: def ReadTxt(): PID_Size = 0 Error = False xlsx_file = Path('SimData', 'C:\\Users\\Computer\\Desktop\\Data.xlsx') wb_obj = openpyxl.load_workbook(xlsx_file) sheet = wb_obj.active print('you have ', str(sheet.max_row), ' lines of data') for row in sheet.iter_rows(min_row=None): PID_Name = row[0].value # this is where the PID located - first place try: PID_Size = len(PID_Name) except TypeError as e: print(PID_Name) # print(e) PID_name = str(PID_Name) PID_Size = len(str(PID_Name)) Error = True if not Error: if PID_Name not in str(PID_LIST) and PID_Size > 4: data = "" for cell in row: data = (data + ":" + str(cell.value)) data = data[1:] index = data.find(':') # print (index) data = data[0:index+1] + " " + data[index+5:] PID_LIST.append(data) for pid in PID_LIST: print(pid)and this is the resualt I'm getting (which is what I wanted) 18FC08F4: D0:2:3D:28:61:26:18:A9 18F3C999: 0:0:0:0:0:FF:5A:0 18FC07F4: B0:13:B0:13:77:1:48:36 18FC1621: 3:C7:0:0:D1:2:FF:E0 0CFE5A2F: 11:0:0:0:D3:F5:FF:F0 18FC1721: F8:2A:20:4E:FF:FF:FE:EE 18FE592F: D2:7C:83:7C:C9:7C:EA:7C 0C000027: 23:FF:FF:F3:FF:FF:FF:F3 . . . . RE: first project read from xslx get only 1 column\row? - Larz60+ - Jun-11-2021 I have used this successfully for quite a while, but never with so much data. I can only suggest that you try some of the other packages available. Here's what available: https://pypi.org/search/?q=xslx&o=-zscore This shows all packages having anything to do with xslx, so a lot of it won't be appropriate for your application. I wish I could help more, but haven't tried any of these. Perhaps others will have a suggestion for you. RE: first project read from xslx get only 1 column\row? - Gribouillis - Jun-11-2021 You could try to convert the xlsx file to csv through external command. Example with libreoffice: This creates file ~/tmp/foo.csv
RE: first project read from xslx get only 1 column\row? - korenron - Jun-13-2021 ok and CVS is much easier to work with ? (I'm not familiar with all excel option at all) it will take less memory ? Thanks , RE: first project read from xslx get only 1 column\row? - Gribouillis - Jun-13-2021 Well, the csv reader in the standard csv module processes the file line by line for example. You don't need to keep more than one line at a time in memory. RE: first project read from xslx get only 1 column\row? - korenron - Jun-13-2021 Ok What I will need to change in the code ? Thanks, RE: first project read from xslx get only 1 column\row? - Gribouillis - Jun-13-2021 korenron Wrote:What I will need to change in the code ?You wouldn't use openpyxl any more. Instead you would iterate on the rows of an object returned by a call to csv.reader() as shown in the documenrtation of the csv module.
RE: first project read from xslx get only 1 column\row? - korenron - Jun-13-2021 I will try it Thank you for the help! |