[PRCo] Re: Generic Description
Jim Keener
jimktrains at gmail.com
Sun May 22 12:20:17 EDT 2011
> All that is well and good for a large business organization or perhaps even a large,
> well funded and staffed government or private museum. But do you really think that
> any significant percentage of the enthusiast community is going to go to all that trouble?
> I sincerely doubt it. If CDs and the like have only an anticipated life span of a
> decade--even two decades--there are going to be a lot of wails of anguish down the road
> when those prized, never to be repeated shots have disappeared or become so corrupted as to
> become unusable. It will make the complaints about color shifts in early Ektachrome and fading
> of Anscochrome seem like faint praise, by comparison.
A friend of mine and I set up a 5TB redundant system between my house
and his in NJ. They sync nightly and generally act the way they should.
It's a matter of money, really. The technology is definitely within the
grasp of the average person, if they know how to use it. I only
mentioned the more expensive NAS because it's more reliable, since this
is what it's built to do, though it isn't very difficult to "roll your own."
The problem with DVDs is that you cannot access them from anywhere and
you need to have some index to be able to know what is on each DVD. We
could just store the raw scans on DVD, I guess, and have all the
metadata and compressed images on live disk. That would be a lot less
intensive. The DVDs would still need to be checked for consistency
often, though.
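As a rough sketch of what "checked for consistency" could look like in practice: record a checksum for every file when a disc is burned, then re-read the disc later and compare. This is only an illustration, assuming the disc is mounted as an ordinary directory; the function names (build_manifest, verify) are made up for this example, not taken from any existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

The manifest doubles as the index of what is on each disc, so it would be kept on live disk along with the metadata. An empty list from verify() means every file still reads back exactly as it was written.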
The issue you mentioned about people crying because their images are
gone is already happening with some of the first generation CD
writables. People don't understand the basics of digital storage,
that's the problem.
I don't see it as trouble, really. It's what needs to be done to
preserve digital records. Would you say that proper care of a car is not
worth it because a significant portion of the community isn't willing or
doesn't know how to do it? If we want these records in a digital
format, these are the steps that need to be taken.
I feel that we should have these records in a digital format for
multiple reasons:
1) If properly done they are timeless.
2) They are accessible from anywhere
3) They are easily mutable.
4) They are easily associated with each other and other metadata.
5) We can do "cool" things with them that we couldn't do with the
physical prints. (e.g. the Google Maps example below and quickly
querying for locations or cars. I'm sure others can think of more)
I guess that's what would need to be decided: do we want digital records
of all, some, or none of our archives? I feel we should store as much
of the archive as we can digitally. The time and money required to do
this versus the benefits of doing it are, like anything, debatable. If we
do create digital archives, we should store things correctly and make
sure that they have all the advantages of being digital (easily accessible,
mutable, &c).
>
> Nor is the solution likely to be to print out each image and store it. Unless one is willing and
> able to purchase and use archival inks and paper, the printed images won't last either. At least
> not in the average enthusiast's storage conditions. And probably not in the typical railway
> enthusiast museum's storage conditions, either.
>
I meant print out the metadata (captions, locations, date, &c), not the
images. The ink alone would cost a small fortune. I just meant that if
people insisted, we could print out all the captions &c.
> We are accustomed to, and spoiled by, the relative permanence of silver halide prints.
> We now live in an age where that technology is rapidly disappearing in common use.
> If the projections you make, and I have seen similar ones elsewhere, are even half accurate,
> our progeny is likely to be confronted with a situation in, say, fifty years where there are
> ample photographs available (with or without captions) of twentieth century tramways and railways,
> but very little of the same for the early twenty-first century! Having gone digital in 2000, I am
> no exception.
>
> But unless we are more successful in regenerating the stock of tramway enthusiasts than we have
> been, few will care anyway by then!
By any definition, only a few will ever care. If we doubled or tripled the
people who actively cared and contributed to this and other museums, it
would still be a few. Having records that are accessible to anyone (or
only some) whenever they want is one step in the direction of gathering
people's attention.
Imagine a console in the museum lobby. It has a Google Map up with some
markers. People would be able to zoom in on the markers around places
they know and be able to see images of trolleys there. It becomes a
much more personal connection.
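Behind a map like that, the lookup is just "which records are near this point?" Here is a minimal sketch, assuming each record carries a latitude and longitude. The PhotoRecord fields, the sample captions, and the approximate coordinates are all invented for illustration; a real system would more likely lean on a database with spatial indexing.

```python
import math
from dataclasses import dataclass


@dataclass
class PhotoRecord:
    record_id: str
    caption: str
    lat: float
    lon: float


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def photos_near(records, lat, lon, within_km):
    """Return records within within_km of (lat, lon), nearest first."""
    hits = [(distance_km(lat, lon, r.lat, r.lon), r) for r in records]
    hits.sort(key=lambda hit: hit[0])
    return [r for dist, r in hits if dist <= within_km]


# Made-up sample records with approximate coordinates.
sample = [
    PhotoRecord("R-1", "sample: car in downtown Pittsburgh", 40.4406, -79.9959),
    PhotoRecord("R-2", "sample: car at Connellsville", 40.0179, -79.5895),
]
nearby = photos_near(sample, 40.44, -80.00, within_km=10)
```

With these sample points, only the downtown record falls inside a 10 km radius; widening the radius would bring in the Connellsville one as well.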
Jim
>
> Dwight
> ----- Original Message -----
> From: Derrick Brashear
> To: pittsburgh-railways at dementia.org
> Sent: Sunday, 22 May, 2011 11:19
> Subject: [PRCo] Re: Generic Description
>
>
> The MIME multipart apparently confused Ecartis. Here's what he said
> before it got encapsulated and sealed away.
>
>
> ---
>
> The longevity of digital records is a valid concern. Home-burned CDs
> and DVDs are only good for a decade or two, though newer media is
> better (http://www.thexlab.com/faqs/opticalmedialongevity.html). Hard
> disks fail; computers die.
>
> The best way to handle digital information is to make it redundant.
> Additionally, the best way to make sure it will be readable is to use
> formats that are open and will still have programs that can read them X
> years from now.
>
> I'll try to find concrete requirements from other organizations, but in
> my experience the usual recommendations are:
>
> 1. Multiple RAID-6* or NAS systems in different buildings that sync
> often (hourly or nightly)
> 2. Have backups* at buildings different from ones with the systems
> 3. Have copies of backups at different buildings
> 4. Test backups regularly
> 5. Keep graduated backups
> 6. Storage space required to backup data is ~3-4 times the actual data
>
> * RAID-6: given X identically sized disks (each of size y), it provides
> (X-2)*y of storage space and survives up to 2 disk failures; beyond that,
> the data can no longer be fully recovered. Often a "hot spare" is
> also used, which stores no data, but if a disk fails the system
> automatically rebuilds the RAID array using the spare disk. RAID is not
> a backup. RAID helps recover from hardware failure.
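To put numbers on the (X-2)*y rule, a tiny helper follows. The function name is made up for this note, and the four-disk minimum is the usual RAID-6 requirement rather than anything stated above.

```python
def raid6_usable_tb(disk_count: int, disk_size_tb: float) -> float:
    """Usable capacity of a RAID-6 array: two disks' worth goes to parity."""
    if disk_count < 4:
        raise ValueError("RAID-6 needs at least four disks")
    return (disk_count - 2) * disk_size_tb


# Example: six 2 TB disks give four disks' worth of usable space.
example_tb = raid6_usable_tb(6, 2.0)
```

A hot spare does not change this figure, since it holds no data until a disk fails.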
>
> * Backups: Can be another running system, optical media, or tapes stored
> at a different location from the main system.
>
> I'm estimating that with 200,000 images (slides, negatives, prints,
> glass slides) (correct me if I'm too far out of the ball park), each
> scan will be on the order of 30MB (kodachrome slide at 2000 dpi 4bytes
> per channel and some fudge). We would also have some compressed
> versions and thumbnails, so let's say another 6MB. We'll probably have
> about a megabyte of metadata (praise anyone who's typing more than a
> million characters as a caption per slide?) So, we're at just under 40MB
> per image (give or take). That gives us a grand total of about 7 to 8TB
> of data to store.
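Working the estimate through as arithmetic, using only the figures quoted above (decimal units, so 1 TB = 1,000,000 MB). The variable names are just for this sketch.

```python
MB_PER_TB = 1_000_000  # decimal units, as disk vendors count them


def archive_size_tb(image_count: int, mb_per_image: float) -> float:
    """Total size in TB for image_count items of mb_per_image each."""
    return image_count * mb_per_image / MB_PER_TB


# Figures taken from the estimate above.
image_count = 200_000
raw_scan_mb = 30
derivatives_mb = 6  # compressed versions and thumbnails
metadata_mb = 1

per_image_mb = raw_scan_mb + derivatives_mb + metadata_mb
exact_tb = archive_size_tb(image_count, per_image_mb)
rounded_tb = archive_size_tb(image_count, 40)

# Point 6 in the list above: backups need roughly 3-4 times the data size.
backup_range_tb = (3 * rounded_tb, 4 * rounded_tb)

print(f"{per_image_mb} MB per image: {exact_tb:.1f} TB")
print(f"40 MB per image: {rounded_tb:.1f} TB")
print(f"backup space: {backup_range_tb[0]:.0f} to {backup_range_tb[1]:.0f} TB")
```

So the per-image total is 37 MB, which comes to 7.4 TB; rounding up to 40 MB per image gives 8 TB, and the 3-4x rule then suggests budgeting something like 24 to 32 TB for backups.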
>
> A system capable of storing that much data would run (with commodity
> hardware) around one to two thousand dollars (see end of email). Tape
> drives that store TBs of data can be expensive. NAS (Network attached
> storage) are also very useful.
>
> It might be best to make DVDs incrementally as we add the initial
> data. As metadata changes (I assume images won't change often) we can
> add those changes to DVDs incrementally as they are made or at the end of the
> day. The DVDs should be made in duplicate and tested yearly, though.
> I don't have enough information on the longevity of Blu-ray discs to
> recommend them at this point.
>
> The key points with digital information though are:
>
> 1. Have duplicates stored separately and securely
> 2. Have spare (hot) media on hand
> 3. Test those duplicates regularly
>
> Whether it be DVDs or hard disks, those 3 points apply equally.
>
> The graduated backup part above is important so that, if there is a
> problem found (data corruption, malware (though unlikely), or just
> "someone deleted that!?!?!"), we can go back in time and find the file at
> some point (hopefully in the not so distant past). The amount of time
> we want to go back will also affect how much space we need. Though, if
> the images don't change often/at all, storing all the changes in
> metadata and system configuration is next to trivial.
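One common way to do graduated backups is to keep every recent snapshot and thin out the older ones. The sketch below shows the idea only; the specific policy (daily for a week, Sunday snapshots for four weeks, first-of-month snapshots for about a year) is an assumption picked for illustration, not a recommendation from any standard.

```python
from datetime import date


def keep_snapshot(snapshot: date, today: date,
                  daily_days: int = 7,
                  weekly_weeks: int = 4,
                  monthly_months: int = 12) -> bool:
    """Decide whether a dated snapshot should be retained.

    Hypothetical policy: keep everything from the last daily_days days,
    Sunday snapshots for weekly_weeks weeks, and first-of-month snapshots
    for roughly monthly_months months.
    """
    age_days = (today - snapshot).days
    if age_days < 0:
        raise ValueError("snapshot is dated in the future")
    if age_days < daily_days:
        return True
    if age_days < weekly_weeks * 7 and snapshot.weekday() == 6:  # 6 = Sunday
        return True
    if age_days < monthly_months * 31 and snapshot.day == 1:
        return True
    return False
```

How far back the oldest tier reaches is exactly the "amount of time we want to go back" trade-off mentioned above: longer retention means more snapshots and more space.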
>
> Having the backups stored at separate facilities is very important in
> the case of natural disaster and fire.
>
> I'm still looking for proper documentation from universities, museums,
> libraries, or government on proper procedures, but this is a good primer
> on it. The basic idea is to accept that things fail, and instead of
> trying to prevent it (like you would with hard copies (concrete
> buildings, fire suppression, &c)) you simply live with it (through
> physical redundancy, though fire suppression &c helps too;) ).
>
> Just to stave off any possible misinterpretations, securing the physical
> media is a definite must. I feel that having digital copies and records
> of that media will aid in securing it, reduce the number of times it
> needs to be accessed, and allow the records to be better curated and
> edited, because anyone (we let), anywhere could look over a record and
> make it better. If need be we could also keep hard copies of all of the
> (metadata) records, though that's a lot of paper.
>
> This is not limited just to images, either. The books and other paper
> records at the museum can also be scanned, OCRed (optical character
> recognition, image to text basically), and made searchable and viewable
> without having to handle the paper medium. Everything I just said
> applies equally well to scanned documents.
>
> Hope any or all of this is helpful,
> Jim
>
> On 5/22/11 12:51 AM, Dwight Long wrote:
> > Fred
> >
> > Just keep those glassine negatives away from humidity. I had all my older
> > (50s-70s) in same but somewhere along the line Delaware humidity caused the
> > glue on the envelopes to seal, but worse, in many cases the envelopes to
> > adhere to the negatives. Major restoration effort required to salvage them.
> >
> > My later ones are in archival sleeves. Hopefully the material will perform
> > as advertised.
> >
> > Of somewhat greater concern is the longevity of electronically recorded
> > data. Will images recorded on CDs, hard drives, DVDs, etc., still be around
> > in x number of years?
> >
> > Dwight
> >
> > ----- Original Message -----
> > From: "Fred Schneider" <fwschneider at comcast.net>
> > To: <pittsburgh-railways at dementia.org>
> > Sent: Friday, May 20, 2011 7:17 PM
> > Subject: [PRCo] Re: Generic Description
> >
> >
> >> Jim:
> >>
> >> I was trying to throw out the complications.
> >>
> >> Ed Lybarger, with his fabulous sense of humor, explains how complicated
> >> it can really be by placing a number on the door of the library at PTM.
> >> The number on that door is the Dewey Decimal System number for railways.
> >> He is telling us that everything in that room is one number in the time
> >> honored library cataloging system and by inference that the standard
> >> system doesn't work at all when you have 10,000 square feet of floor space
> >> covered with stuff all meeting the same definition. (Actually Dewey used
> >> 385 and 625 and we probably would not know how to split them. The first
> >> was transportation; the second was technology. Can you visualize the guys
> >> arguing over which is which? He had no separate category for trolleys;
> >> just railroads, although the online reference I have only shows the first
> >> three digits ... railroads and highways all are in 625. After the decimal
> >> we might split it into trolleys.)
> >>
> >> Now you have to find a new system and it needs to be a system that works
> >> not only for the aficionados who collected that crap but for the people
> >> who know absolutely nothing about it. The hired educator for the museum
> >> who has to teach children about trolleys has to be able to find what she
> >> wants in the library without becoming discouraged. The director needs to
> >> be able to use it to answer a newspaper's query. The librarian needs to
> >> be able to find the pictures we have from Williamsport when someone wants
> >> to do a book. Hopefully the library will also be a resource that
> >> contains more than just pictures and an occasional engineering drawing;
> >> wouldn't it be nice if it also contains financial and business records
> >> about the industry?
> >>
> >> I can tell you a lot of the problems but I personally cannot be there much
> >> of the time because I live four and a half hours away. If I were five
> >> miles away, I would probably be there two days a week but I'm not there.
> >>
> >> The ideal way of archiving collections is to put everything in a standard
> >> data base. Sometimes you simply don't do what you know you should
> >> because you have so little free time that you must attack those things
> >> that were not done in any way at all and ignore those that were done.
> >>
> >> For example ... my trolley negatives are there. They are already in
> >> individual glassine envelopes with the negative and the envelope each
> >> bearing a file number. Perhaps it is not the best storage medium but it
> >> is workable as long as they are all safety film (and all except a handful
> >> are). All have been numbered from T-1 up to T-3000 something. There is
> >> a loose leaf index that describes each one in numerical order. There is
> >> also a file of photographic proof sheets in order by company, i.e. all of
> >> the Pittsburgh negatives were pulled out and proofed and then those 8x10
> >> proofs are in a Pittsburgh folder. All the Washington negatives were
> >> proofed and in a Washington folder. And so forth. Now, even if that
> >> does not meet your standards, do you mess with it or do you simply leave
> >> Fred's filing alone? Answer, until everything else is done, you probably
> >> wisely leave Fred's system alone because you don't have the money to redo
> >> it. You spend precious resources on the
> >> negatives that are not identified and those that are not in acid free
> >> envelopes. So what do you do with Fred's? You probably put an FS in
> >> front of his number (or something else unique to help you find them) and
> >> then copy his file into a data base as simply as possible and scan them
> >> ... you make it a KISS project because there are too many other projects
> >> screaming for help.
> >>
> >> More important might be to take all of the thousands of negatives I
> >> brought over from the Goldsmith and Watts collections that are mostly on
> >> non-safety film (highly combustible) and refile them in open sided, acid
> >> free envelopes and then build a concrete vault away from the main building
> >> to house all the combustible negatives.... Can you see the need for
> >> millions of dollars?
> >>
> >> If you are not familiar, remember the words SAFETY FILM on the edge of
> >> films produced in the 1940s and 1950s? As long as there were still some
> >> older combustible materials produced, the newer cellulose acetate
> >> materials were labeled SAFETY FILM. When we moved from glass plates to
> >> flexible materials, the films were made of cellulose nitrate. It
> >> will, if stored in stacks, spontaneously combust. It needs to breathe. If
> >> you get enough of that crap, it will blow the roof off a building.
> >> Theater movie projectors were designed with very sophisticated light
> >> baffles so that if the motor quit running, the light would also be shut
> >> off to prevent combustion of the film. My father remembered a major
> >> theater fire in Cleveland in the late 1920s. I've been told that an
> >> entire 800 foot, 20 minute reel of 35mm film could easily go up in smoke
> >> in seconds. The Hippodrome in Lancaster was gutted in the early 1920s
> >> ... same reason. Eventually it became law that projection
> >> booths in theaters had to be surrounded by concrete!
> >>
> >> By the late 1930s we were producing films on cellulose acetate ... but
> >> some photographers still bought the cheaper stuff. I know my father
> >> still found some nitrate base 35mm film right after World War II ... he
> >> had that 35mm film rolled up and it basically turned to jello.
> >>
> >> That should give you a clue that a lot of the collections from older
> >> railfans are time bombs.
> >>
> >> Those images on sheet films made of cellulose nitrate are largely
> >> lost because the thicker the base, the more likely it was to decompose.
> >> I remember Harold Cox telling me that most of the Philadelphia Rapid
> >> Transit archive from the end of the glass era until the beginning of the
> >> safety film era had virtually vanished because it was professionally done
> >> on thick sheet film negatives and they simply decomposed to flammable
> >> dust! (Thinner roll film negatives were more permanent.)
> >>
> >> So, Jim, do you worry about what Fred did with his collection? I don't
> >> think so. It is not done in a fashion which I believe suitable for future
> >> users. I wrote it like a railfan. The journal reads: "Company, car
> >> number, direction, location, date and any other relevant items we might
> >> like." If I were redoing it today for a new generation of users, I would
> >> probably put city, county, state, date right in the first position. But
> >> there is a record that someone can work with in a few years when I'm gone.
> >>
> >> Fred
> >>
> >> (Only proof read once ... if you don't understand something, ask.)
> >>
> >>
> >>
> >> On May 20, 2011, at 4:29 PM, Jim Keener wrote:
> >>
> >>> Sorry for my naïveté. I guess I'm trying to jump into a discussion I
> >>> haven't been involved in before and might not know pre-existing
> >>> protocols. I've done databasing and cataloguing of things, but never
> >>> really archiving before. I'm also not familiar with how other museums
> >>> arrange their archives.
> >>>> 1) The title that includes company and car number is bad because you
> >>>> might have, in a museum such as ours, a hundred identical titles.
> >>>>
> >>>> 2) That description: "West Penn. FT 3. Connellsville Shops." is
> >>>> apparently what Frank put on the slide and it means nothing to the
> >>>> average person. If you come to the museum from Pocatello, Idaho, what
> >>>> does Connellsville shops mean? But a descriptor that reads "Company
> >>>> car repair facility in Connellsville, PA" might be understandable. And
> >>>> what does that FT 3 indicate. Be damned if I have a clue.
> >>>>
> >>> While not an ideal situation, it's at least something. For instance,
> >>> "West Penn. FT 3. Connellsville Shops." doesn't really mean much to an
> >>> outsider. However, someone can come along later and flesh it out.
> >>> Especially if these are all scanned in and in a database, it's
> >>> trivial to change the captions and keep track of the changes. Even if
> >>> they are captions on paper, it can be changed later, but at least
> >>> something is there and initial time can be spent towards ones with
> >>> poorer captions (e.g.: company and car number with no location).
> >>>> A description should probably start with a file number or archive
> >>>> number. Next we probably need to figure out who the user is and what
> >>>> he wants. Does he want to find West Penn Railways? Or does he want
> >>>> to find trolleys from Uniontown, PA? Or might he be interested in
> >>>> trolleys from Fayette County, Pennsylvania? Or Southwestern
> >>>> Pennsylvania? All of these are possible descriptors that we might wish
> >>>> to use to help the user find something. Remember guys, we're looking
> >>>> at this as rail fanatics. The ultimate user might not be one of us.
> >>>> He might simply be a transport historian or a historian in general 50
> >>>> years from now. Incorporating the car number into the descriptor might
> >>>> be a minor thing for the user we will be serving. (I am a railway
> >>>> historian trying to think how someone else might want to use our files
> >>>> when we are not here. I can look at the declining number of hobbyists
> >>>> in groups like the NRHS or the ERA and understand that we won't be
> >>>> here.)
> >>> Will the database be electronic, or do you want a lot of information on
> >>> the physical slide and in the record number? If it's electronic, a record
> >>> ID on the slide might suffice. Otherwise, the identifier on the slide
> >>> could contain encoded information: <map grid> <company> <car #> <year>
> >>> <record id>. The map grid could be designed to flow so that someone
> >>> looking through the physical archives wouldn't have to skip around all
> >>> too much to view something geographically close. Lexically sorting by the
> >>> order suggested would have the records sorted in a pseudo-geographic
> >>> manner and then grouped by company and car.
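A quick sketch of how such an identifier could be encoded so that plain alphabetical sorting gives the order described. The field widths, the "-" separator, and the grid and company codes here (C04, D07, PRC, WPR) are all hypothetical; the one real requirement is that numeric fields are zero-padded to a fixed width.

```python
from typing import NamedTuple


class SlideId(NamedTuple):
    map_grid: str     # hypothetical grid cell code, e.g. "C04"
    company: str      # hypothetical company code, e.g. "PRC"
    car_number: int
    year: int
    record_id: int

    def encode(self) -> str:
        """Build the label text; zero-padding keeps string order numeric."""
        return (f"{self.map_grid}-{self.company}-{self.car_number:05d}"
                f"-{self.year:04d}-{self.record_id:07d}")


# Made-up sample identifiers.
ids = [
    SlideId("D07", "WPR", 3, 1948, 120),
    SlideId("C04", "PRC", 1711, 1952, 45),
    SlideId("C04", "PRC", 98, 1939, 7),
]
labels = sorted(slide.encode() for slide in ids)
```

Without the padding, car 1711 would sort ahead of car 98 as text, which is the main thing to watch for if the labels are ever sorted by hand or in a spreadsheet.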
> >>>
> >>>> Countless hours? Again, nothing is impossible for those who are not
> >>>> doing it. If you have 200,000 photos that need to be captioned and
> >>>> it takes an average of 15 minutes to do a caption, we are talking 24 man
> >>>> years. Is that a safe number for the collection? Might be. My own
> >>>> collection is close to 50,000 prints and I am simply extrapolating from
> >>>> the number of file cases.
> >>>> I have not hauled the other file cases out to Washington yet. I might
> >>>> add that PTM also has my albums already and that might include another
> >>>> 5,000 prints or six months worth of full time data entry. Did I hear
> >>>> anyone volunteering?
> >>>>
> >>> I'd be near useless identifying places outside of the city, but I would
> >>> be able to scan and/or enter descriptions into a database. Doubly so if
> >>> I could take a small deck of slides home each week and do them at nights
> >>> and mornings when I have small bits of time to spare, though I don't
> >>> have a slide scanner at home.
> >>>> Ray, a simple description is fine. One that reads West Penn 700-type
> >>>> car on the Fairchance line believed to be near Hopwood about 1948 is OK
> >>>> until you refine it. But it requires historians willing to write such
> >>>> words as "believed" or "unknown" or "suspected" or "circa" or "about"
> >>>> when we do not know for certain.
> >>> Is it uncommon for people to mark their captions with uncertainty? Do
> >>> they just refuse to write them or write them with certainty?
> >>>> Perhaps trolley near Hopwood, Fayette County, Pennsylvania circa 1948
> >>>> might even be better for the future user with the railfan details buried
> >>>> farther down in the description.
> >>>>
> >>>> Regardless, what is written needs to be correct and there are thousands
> >>>> of pictures and slides which were never captioned. The guys that
> >>>> volunteer simply look at Ed and say what's this. Then he throws them
> >>>> in a pile and waits for Fred to appear. There are still going to be a
> >>>> large number that I don't know. We need more resources.
> >>>>
> >>>> When I edited Headlights magazine 40 years ago and someone gave me a
> >>>> picture that they couldn't identify, I used it to fill space. It
> >>>> became a Can you identify this? feature. But we had national
> >>>> circulation. We usually found out. Unfortunately doing the same in
> >>>> Trolley Fare probably won't get us the same following.
> >>>>
> >>> A friend of a friend did this: http://retrographer.org/ I don't know
> >>> how useful it would be in helping us though. I'm not sure of their
> >>> traffic volume.
> >>>
> >>> Also, wouldn't it be OK to scan in slides and negatives as-is, record
> >>> whatever information is on the slide (if any), and caption them
> >>> properly later? It would be easier on the physical media to not have to be
> >>> handled as people try to figure out where it was taken and what is in
> >>> it. It would also make it easier for the general public to browse.
> >>>
> >>> I could also imagine some computer vision (CV) or artificial
> >>> intelligence (AI) students at CMU or Pitt having fun (doing a school
> >>> project) trying to guess locations, which would then have to be approved
> >>> by a human. It'd only be useful with a reference of some kind in part
> >>> of the picture, however, but there are good/decent archives of much of
> >>> what's in the city, and Google Street View's extensive coverage of
> >>> the city could help too. Just a thought ::shrug::
> >>>
> >>> Jim
> >>>
> >>>
> >>> -- Attached file removed by Ecartis and put at URL below --
> >>> -- Type: application/pgp-signature
> >>> -- Desc: OpenPGP digital signature
> >>> -- Size: 901 bytes
> >>> -- URL :
> >>> http://lists.dementia.org/files/pittsburgh-railways/02-signature.asc
> >>>
> >>>
> >>>
> >>
> >>
> >
>
>
> On Sun, May 22, 2011 at 11:16 AM, Jim Keener <jimktrains at gmail.com> wrote:
> > -- Attached file removed by Ecartis and put at URL below --
> > -- Type: text/plain
> > -- Size: 19k (19473 bytes)
> > -- URL : http://lists.dementia.org/files/pittsburgh-railways/ecartjjX7Wl
> >
> >
> >
> > -- Attached file removed by Ecartis and put at URL below --
> > -- Type: application/pgp-signature
> > -- Desc: OpenPGP digital signature
> > -- Size: 906 bytes
> > -- URL : http://lists.dementia.org/files/pittsburgh-railways/03-signature.asc
> >
> >
> >
> >
>
>
>
> --
> Derrick
>
>
>
>