Python Forum
Unidecode issue
#1
Hi,
In some PDFs I encounter references to the original parish register, like so: ref = ' RP 477; p. 148 r° '
I run unidecode on all strings in the document: fieldUni = unidecode.unidecode(field).upper()

This has never caused any problems, except in the above case, where I get this: ' RP 477; P. 148 RDEG '

The " ° " has been "translated" into DEG. That is not what is meant here.

How do I avoid this translation in Python (other than a manual Ctrl-H replace of '°' with ... etc. in the text document)?
thx,
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
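
One possible workaround, sketched under assumptions rather than taken from the thread: split each string on the characters that should survive, transliterate only the pieces in between, and join everything back together. The helper name unidecode_keeping and the KEEP constant are hypothetical, and the stdlib fallback is only there so the sketch runs when unidecode is not installed.

```python
import re
import unicodedata

try:
    from unidecode import unidecode  # third-party package used in the post
except ImportError:
    # Stdlib stand-in (an assumption, not equivalent to unidecode):
    # strips accents and drops anything that has no ASCII form.
    def unidecode(text):
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")

# Characters to pass through untouched (assumption: only the degree sign).
KEEP = "°"


def unidecode_keeping(text, keep=KEEP):
    """Transliterate text to ASCII, leaving characters in `keep` unchanged."""
    # The capturing group makes re.split return the kept characters too,
    # so they land at the odd indices of the resulting list.
    pattern = "([" + re.escape(keep) + "])"
    parts = re.split(pattern, text)
    return "".join(
        part if index % 2 else unidecode(part)
        for index, part in enumerate(parts)
    )


ref = " RP 477; p. 148 r° "
field_uni = unidecode_keeping(ref).upper()
print(repr(field_uni))
```

Note that the result is no longer pure ASCII, since the "°" is kept on purpose. If later steps need ASCII only, the same helper could be used with a final replace of "°" by whatever abbreviation suits the data.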
Messages In This Thread
Unidecode issue - by DPaul - Sep-02-2023, 06:42 AM
RE: Unidecode issue - by Gribouillis - Sep-02-2023, 08:45 AM
RE: Unidecode issue - by DPaul - Sep-03-2023, 06:07 AM
RE: Unidecode issue - by DPaul - Sep-03-2023, 04:55 PM
RE: Unidecode issue - by Gribouillis - Sep-03-2023, 06:20 PM
RE: Unidecode issue - by DPaul - Sep-04-2023, 06:23 AM