Python Forum
pdf question
#1
Hi,
When documents are scanned, using a "bulk" scanner,
the result is either a bunch of tifs or pngs.
Those image formats are perceived as being "future developments safe".

But for some types of documents I could follow another route, programmatically,
and end up with one large PDF that is considerably smaller than the
individual images combined. (Server space is at a premium.)
The question is: do we think that the PDF format is as "future-safe" as keeping all the
pages saved as TIFs or PNGs?
Any opinions?
thx,
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
#2
PDFs are a pain in the butt, mainly because of all the formats that may be contained inside, from barely readable images to nice, orderly digital tables.

I would choose almost any other format if at all possible.
#3
(Sep-12-2023, 08:54 PM)Larz60+ Wrote: pdf's are a pain in the butt, mainly because of all the formats that may be contained inside,
Every day I deal with PDFs, indeed with many different things inside.
I have developed various OCR strategies for different types of content.
Pain in the *, yes, but now that I have seen them all (I think), I've come to like them. :)

Some of them, e.g. the ones directly generated from a long lost excel sheet,
offer the possibility to turn them into pngs or keep them as pdf. A search program will move to
the right page in a split second.

I would not ask the question about the future of PDFs if the storage difference were not something like 1 to 100!
So the bottom line is: will PDF formats always be backwards compatible, will TIFFs be, will PNGs be ...
It is not that these conversions are so difficult to do; it is the eons of time needed to process zillions of them.
Any thoughts?
Paul
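To put numbers on the trade-off raised in this post, a minimal Python sketch could compare the storage footprint of a folder of page images with that of a single PDF, and give a rough estimate of batch conversion time. The function names, the folder layout, and the per-file timing are all assumptions for illustration; nothing here comes from the poster's actual setup.

```python
from pathlib import Path


def total_bytes(folder, suffixes):
    """Sum the sizes of all files in `folder` whose extension is in `suffixes`."""
    wanted = {s.lower() for s in suffixes}
    return sum(
        p.stat().st_size
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


def size_ratio(images_bytes, pdf_bytes):
    """Return how many times larger the images are than the PDF."""
    if pdf_bytes <= 0:
        raise ValueError("pdf_bytes must be positive")
    return images_bytes / pdf_bytes


def estimated_hours(n_files, seconds_per_file, workers=1):
    """Rough wall-clock estimate (in hours) for converting a batch of files."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return n_files * seconds_per_file / workers / 3600
```

For example, `size_ratio(total_bytes("scans", {".tif", ".png"}), Path("scans.pdf").stat().st_size)` would give the storage ratio for a hypothetical `scans` folder and `scans.pdf` file, and `estimated_hours` shows how quickly a small per-file cost adds up over a large archive.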
#4
Adobe has maintained PDFs for a very long time.

There are way too many documents in archives to even think of dropping support.
Also, companies like ReportLab base their entire existence on this format.

I think you can be comfortable that the format will remain for a long time to come.
#5
(Sep-13-2023, 11:15 AM)Larz60+ Wrote: I think you can be comfortable that the format will remain for a long time to come.
That is what I wanted to hear. :)
In fact, you could say that about the JPG format too.
I bet that every day 100 million pics are taken using that format.
The continuity of the carrier technology on which these things are preserved
is probably of greater concern, e.g. old 5 1/4" or 3 1/2" diskettes.
thx,
Paul
#6
FYI:
Adobe made the PDF format an open standard (ISO 32000-1, published in 2008, so well over 10 years now).
You can get the entire Acrobat SDK here.
#7
(Sep-14-2023, 04:34 PM)Larz60+ Wrote: You can get the entire Acrobat SDK here.
How can a simple thing like a PDF have a reference guide of 756 pages? :-/
Paul
#8
Not so simple. That is, after all, the beauty of well-written software: it appears 'simple' to the user.