Python Forum
python/access/pandas
Hi,
I was handed an .mdb. Old-timers in the centre say that it was started
at least 25 years ago. Over the years various people have entered data
through a "form". The Access version it was built with is long gone, and the only communication
with the outside world is a monthly report.
I can handle .accdb with Python; however, .mdb = problems.
Surprisingly, modern Excel will open the .mdb and show a nice
spreadsheet with rows and columns. It only looks nice.
I suspect users have entered long texts into the entry boxes of the "form"
and used CRLF when they should not have. When saving the spreadsheet "as csv",
the records are messed up. (It's an inventory of old parochial registers.)
Hence my question. After having read the Excel file with pandas and turned it into a dataframe "bib":
for idx, row in bib.iterrows():
    title = row['title']
    # etc.
Can I eliminate any CRLF from a dataframe field before saving it to a txt file, or directly into SQL?
(I could turn the data into ASCII codes and check for special chars, but there must be a better solution.)
thx,
Paul
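One possible approach to the question above, as a minimal sketch rather than a definitive answer: instead of looping with iterrows(), replace the line breaks across the whole dataframe in one call and then write it to SQL. The sample data, the table name "bib" and the in-memory SQLite database are assumptions made up for illustration; they are not from the original .mdb.

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for the dataframe read from the spreadsheet.
bib = pd.DataFrame({
    "title": ["Baptisms\r\n1750-1760", "Marriages\n1761", "Burials"],
    "year": [1750, 1761, 1770],
})

# Replace CRLF, a lone CR, or a lone LF in every text cell with one space.
# replace() returns a new dataframe, so the result has to be assigned.
clean = bib.replace(r"\r\n|\r|\n", " ", regex=True)

# Write the cleaned rows into an in-memory SQLite table (assumed target).
conn = sqlite3.connect(":memory:")
clean.to_sql("bib", conn, index=False)

# Read the rows back to confirm each record is intact.
rows = conn.execute("SELECT title, year FROM bib ORDER BY year").fetchall()
```

Numeric columns such as "year" are left untouched, because the regex only applies to string values. A space is used as the replacement so that words on either side of a line break do not run together; an empty string would work the same way if that is preferred.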
EDIT: I have discovered the culprit.
It seems that you can do: bib.replace('\n','', regex=True).
But now when I save to txt, I find _X000D_ instead of '' (nothing).
What is _X000D_ ?
Paul
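On the _X000D_ question: in the Excel XML file format, _x000D_ is the escaped form of a carriage return (character code 0x0D, the CR half of CRLF). A likely explanation, offered as an assumption, is that removing only '\n' leaves the CR behind, and that it reaches the dataframe as this literal text token rather than as a real '\r'. The sketch below removes the token in either upper or lower case together with any real CR or LF characters. The sample titles are invented for illustration.

```python
import pandas as pd

# Hypothetical sample: one cell with the escaped CR plus a real LF,
# one cell with the upper-case variant of the token.
bib = pd.DataFrame({
    "title": ["Baptisms_x000D_\n1750-1760", "Marriages_X000D_\n1761"],
})

# One or more of: the literal token (either case), a real CR, a real LF.
# The trailing "+" collapses a whole run into a single replacement.
pattern = r"(?:_[xX]000[dD]_|\r|\n)+"

# replace() does not modify bib in place, so assign the result back.
bib = bib.replace(pattern, " ", regex=True)
```

Note that calling bib.replace(...) without assigning the result (or passing inplace=True) leaves the original dataframe unchanged.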
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.

Messages In This Thread
python/access/pandas - by DPaul - Nov-11-2023, 08:04 AM
RE: python/access/pandas - by buran - Nov-11-2023, 12:44 PM
RE: python/access/pandas - by DPaul - Nov-11-2023, 06:23 PM
