From what you have shown, the word PROVIDER
is followed by 3-4 (5 ?) lines. The lines you need.
If you can rely on that layout, wherever it is,
and if you have the coordinates of "PROVIDER",
you can figure out the other lines.
The difference is that you have been using Adobe acrobat ( = a piece of software),
while Tesseract or pdfplumber are modules that need to be imported into a python program.
If you use tesseract, the coordinates can be found either with the to_data or the to_boxes option.
I would go for to_data first. Gives you the x/y coord of whole words.
Paul
is followed by 3-4 (5 ?) lines. The lines you need.
If you can rely on that layout, wherever it is,
and if you have the coordinates of "PROVIDER",
you can figure out the other lines.
The difference is that you have been using Adobe acrobat ( = a piece of software),
while Tesseract or pdfplumber are modules that need to be imported into a python program.
If you use tesseract, the coordinates can be found either with the to_data or the to_boxes option.
I would go for to_data first. Gives you the x/y coord of whole words.
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.