Digital Filing System

  • Thread starter: yuryg
Why some digital files are STILL stored like paper files

If I have a variety of applications that create documents for a single project, how do I find them all? Think spreadsheets, presentations, photos, datafiles, audio, line and vector drawings, scanned-but-not-OCR'ed documents. I would have to remember all of the applications or undertake the separate step of providing search keywords in, say, the document properties (Microsoft Office is my reference here). Not every document for a project will necessarily contain the keywords that would enable it to be found by a full file-content search.

Further, the performance of full file-content searches is NOT very good unless they are preindexed. The farther we go outside the realm of text to line graphics, datafiles, photos, scanned documents, spreadsheets, audio files, and so forth, the farther we go back into a realm where we are forced to use filing techniques much like those of the paper world to find our documents.
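
To make the preindexing point concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration, not how any particular desktop search tool works; the document names and contents are made up, and it assumes the text has already been extracted from each file. A lookup touches only the index, not every file, and a scan with no OCR text simply cannot be found this way.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of document names that contain it.

    `documents` is a dict of {name: extracted text}; the index is built once, up front.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical sample documents (names and contents are made up).
docs = {
    "budget.txt": "Project Alpha budget for the third quarter",
    "notes.txt": "Meeting notes for Project Beta",
    "scan001.txt": "",  # a scanned page with no OCR text contributes nothing
}
index = build_index(docs)
print(search(index, "project alpha"))
```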

That is why there is still such force to the paper-world metaphors.
 
It seems to me that saving project-related keywords when you save the files is not that enormously difficult. Certainly no more difficult than trying to fit electronic files into a paper-like file structure.
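
As a rough sketch of what saving keywords alongside files could look like outside of any one application, the Python below keeps a small catalog that maps file paths to keywords. This is an assumption for illustration only: the function names, the JSON catalog file, and the sample paths are all hypothetical, and it is not the document-properties feature of Microsoft Office.

```python
import json
from pathlib import Path

def tag_file(catalog, file_path, keywords):
    """Record project keywords for a file of any type (photo, audio, scan...)."""
    existing = set(catalog.get(file_path, []))
    existing.update(k.lower() for k in keywords)
    catalog[file_path] = sorted(existing)

def find_by_keyword(catalog, keyword):
    """Return every catalogued file tagged with the keyword, sorted by path."""
    keyword = keyword.lower()
    return sorted(path for path, tags in catalog.items() if keyword in tags)

def save_catalog(catalog, catalog_path):
    """Write the catalog to a JSON file (hypothetical location chosen by the caller)."""
    Path(catalog_path).write_text(json.dumps(catalog, indent=2))

def load_catalog(catalog_path):
    """Read the catalog back, or start empty if the file does not exist yet."""
    path = Path(catalog_path)
    return json.loads(path.read_text()) if path.exists() else {}

# Hypothetical files from one project, created by different applications.
catalog = {}
tag_file(catalog, "photos/site_visit.jpg", ["Alpha", "site visit"])
tag_file(catalog, "audio/kickoff.wav", ["alpha"])
tag_file(catalog, "drawings/floorplan.dwg", ["Beta"])
print(find_by_keyword(catalog, "alpha"))
```

The tagging step takes a few seconds per file at save time, and the lookup works the same way for a photo or an audio file as for a text document.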

To each their own, of course. I file most documents by application, with subdirectories for my major clients, using intuitive file names whenever possible. Still, for anything older than a month or two, searching usually finds it faster.

Katherine
 
I have divided my huge digital filing system into logical subdivisions.
I will NEVER just dump all my files in one directory and then use a Desktop Search to find them.
Having them organized actually sparks my thinking and allows me to browse through them, so it's more than just a random database.
I can also use various Search functions if I need to.

I scan mainly using a compressed JPEG, 150 dpi, B&W.
This works fine. It takes about 1 minute a file, including naming and filing the file.
I am only scanning a couple dozen new files per week, so it works well.
I have a SCAN folder on my scanner, and I just put stuff into that.

Each year, I do things like scan parts of my tax file from 7 years ago, etc., and then I destroy the old docs. The size of my paper files has really decreased since doing this, as I now have an extensive Digital Archive of all my key docs.
BACKUPS BACKUPS BACKUPS BACKUPS
http://www.langa.com/backups/backups.htm

Coz

some old threads about this..

http://www.davidco.com/forum/showthread.php?t=2505&highlight=scanning

http://www.davidco.com/forum/showthread.php?t=1938&highlight=scanning

http://www.davidco.com/forum/showthread.php?t=1974&highlight=scanning
 
ACROBAT question?

OK, I took the good tip to use Adobe Acrobat to IMPORT from SCAN.
It worked great. I scanned a 4-page doc, and it looks great.
B&W, 150 dpi.

My question is, is there a way to COMPRESS the file?
The resulting file is 3x the size I got when I compressed the scans as JPEGs.

There has to be a way to shrink it down...

Coz
 
Scan Size & Filing

I have a large library of scanned resources: various math worksheets and instruction sets.

I rarely use Adobe Acrobat unless I'm having trouble with some graphics; I use my scanner's software (Visioneer PaperPort 7) and then print to the Acrobat PDFwriter printer. I scan BW, 300 dpi, and a 2-page file is 70-100kb. The docs look good, and kids can read the small superscripts and subscripts.
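
For a sense of how much compression those numbers imply, here is a back-of-the-envelope calculation in Python. It assumes US Letter pages (8.5 x 11 in) and 1 bit per pixel for black and white; neither is stated for these particular files, so treat the result as an estimate only.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

def compression_ratio(raw_bytes, stored_bytes):
    """How many times smaller the stored file is than the raw bitmap."""
    return raw_bytes / stored_bytes

# Assumed: US Letter pages, 300 dpi, 1 bit per pixel (black and white).
page = raw_scan_bytes(8.5, 11, 300, 1)  # 2550 x 3300 pixels, 1 bit each
two_pages = 2 * page

print(page)                                            # bytes per raw page
print(round(compression_ratio(two_pages, 100 * 1024)))  # ratio at 100 KB
print(round(compression_ratio(two_pages, 70 * 1024)))   # ratio at 70 KB
```

Under those assumptions a raw page is about 1 MB, so a 2-page file of 70-100 KB is roughly 20 to 30 times smaller than the uncompressed bitmaps.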

Creation is about a minute each. It's fairly boring. I listen to music and play solitaire or minesweeper while I'm doing it: if I try doing anything else, I lose all of the scanning efficiency.

I keep the scanner lid open all of the time, and since I am scanning books, I use a 5 pound weight to keep the book in place.

It's worth setting up the scan for the proper size paper. If you are scanning 8.5 x 11, set for 8.4 x 10.9--you will get rid of a lot of pixel noise that way. I can put up with about 45 minutes (an album) before going crazy.

I've moved PaperPort's main folder from Program Files into My Documents, so that everything gets backed up on a regular basis. (Documents in Program Files folder give me a pain in the neck. Why do app writers continue this?)

Doc filing is fairly easy for me: a worksheet for Algebra Chapter 6 section 2 would file as Algebra/Chapter 6/WS0602.pdf
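
A naming scheme like this is easy to generate automatically. The Python sketch below assumes the pattern is "WS" followed by a two-digit chapter and a two-digit section, which is inferred from the single example above; the function name is hypothetical.

```python
from pathlib import PurePosixPath

def worksheet_path(subject, chapter, section):
    """Build a path like Algebra/Chapter 6/WS0602.pdf.

    Assumes two-digit, zero-padded chapter and section numbers in the file name.
    """
    filename = f"WS{chapter:02d}{section:02d}.pdf"
    return PurePosixPath(subject) / f"Chapter {chapter}" / filename

print(worksheet_path("Algebra", 6, 2))
```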

Best regards
gunns256
 
OK, that sounds good. I try to keep it about that size.
How do you "print to the Acrobat PDFwriter printer"?
That sounds vaguely familiar...

Coz

gunns256 said:
and then print to the Acrobat PDFwriter printer. I scan BW, 300 dpi, and a 2-page file is 70-100kb.
gunns256
 
gunns256 said:
If you've got Acrobat authoring software, then Acrobat PDFwriter installs as a printer.

If you do not have the $$$, here are some others you can use:

 PDF995 (http://www.pdf995.com/) – This is an adware-based PDF converter printer driver for the PC. There is a low-price commercial option available that will get rid of the ads. [Free / $$$]

 PDF 4 Free – (http://www.pdfpdf.com/index.html) – This is free software that inserts the tag of the software on each of the pages created. Not the optimum, but it is free. [Free]

 PDF Creator (http://sector7g.wurzel6.de/pdfcreator) is an open source project on SourceForge for conversion to PDF. [Free]
 
OK, I figured it out.
You scan using your scanner's software, then select print, then select the Adobe PDFwriter, and save it as a PDF.
I tried it, and I still have problems with the compression.

Are you compressing the pages BEFORE you send them to the PDFwriter?
You scan them as a TIFF, right? Or as something else?
Then do you compress them or convert them BEFORE you save as a PDF?
My scanner scans as a TIFF, and then I usually convert to JPEG and compress them. Of course, this can be done many ways.
Perhaps your scanner software does this automatically?

Also, how do you handle putting multiple scans into one PDF file?
Perhaps your scanning software is better than mine, but there are likely some functions in there I am not using yet.

Coz

 
Visioneer Paperport

PaperPort saves files in its proprietary .max format. It also allows me to "stack" pages, that is, to group several scans into the same document. The .max files are not that large, and I get the 70-100 kb .pdfs when I create them from PaperPort. I'm at work right now, so I can't tell you how large the .max files are.

gunns256
 
Yeah, thx. That's what I thought.
I need to get software that can stack like that, although combining PDFs is not hard.
I will try to set the scan to be more compressed, and this will likely give a better PDF. Paperport must be automatically compressing the files in a pre-determined MAXimum way that works well.
I bet my scanning software can do it; there are lots of buried functions in there.
thx.

Coz

 
I just downloaded a nice looking 30 page PDF that has lots of diagrams, etc, and it is only 212 KB, for ALL 30 pages!
So that's what, 7 KB a page?

And it looks fine when viewed at 150%.

I want to figure out how to do that.
Anyone know how to get the file that compressed?

Coz
 
Not scanned!


I am sure that this document was created electronically (for example printed from Word to PDF writer) and all the diagrams were saved as vector graphics - not bitmaps. If you scan into Acrobat it must save the compressed bitmap of the whole page.

TesTeq
 
Hmm, good point.
I checked the properties for that PDF and it says it was created with QuarkXpress Laserwriter, and produced with Acrobat Distiller on a Mac.

So then the next question becomes: how compressed can you make a scanned document that is saved as a PDF and have it still be easily readable? (My guess would be 50-100K per page? But maybe it COULD be less?)

And what is the best file format to scan it as, and to compress it in, before it is converted to PDF?

In the Acrobat compression area, I even set it to HIGH compression, and this didn't do anything.
When I have time later, I will do a search. There must be folks out there who know everything about Acrobat.

Coz

 
PDFZone

PDFZone is a place you might want to go. However, the amount of knowledge there was too large for me to take in. Acrobat is used extensively by professional publishers and printers; much of the discussion centers around these aspects of the output.
 
Paperport...

Last fall, I decided to make my small business (an appliance repair shop) as paperless as possible - I started this past January 1st. I use PaperPort Deluxe 9 with a multifunction network printer/fax/scanner - a Brother 420CN. I scan all paper into PaperPort's own file type, .max. When I need a PDF, say to email to someone, I 'print to PDF' using the PaperPort program. Works perfectly.

I sort all scanned files just as if I were placing them into a filing cabinet. I still use the search function a lot though. I scan all paper that crosses my virtual desk every day - about 30 pages or so. It takes less than 10 minutes to scan, index and file. While doing that, I do things like - well, what I'm doing right now - writing notes or replying to emails. I hardly think about paper anymore as it has become such a non-issue for me. If I need something, I hit the search button and I've got it in hand in seconds.

I have also put most of last year's paper into digital format - it took me several good weekends of work. I also scan magazine articles, user manuals, quotes from books, etc., which I keep in various files including my GTD Reference and Future files.

For backups, I simply use an IOGear ION external hard drive - plug it into a USB port and drag and drop - not sure exactly how long it takes as it is the last thing I do before I go to bed.

I have also scanned about 90% of my service manuals. PaperPort allows me to 'stack' them in the same order that they are in the book form, which makes reading them about the same as reading the actual manual. I scan at 200 dpi in greyscale - seems to work the best for me - easy to read - and the OCR program makes fewer mistakes.

As I have my laptop with me wherever I go, I have my office/filing cabinet with me also. Although the initial scanning was extremely time consuming, the result has been a huge time and stress saver.

Todd
 