Books
- Is it technically possible to store the Library of Congress? Yep, just a few terabytes. If you store images of the books it will cost (at the moment) $10 a book.
- BookMobile: kids can download, print and bind their own book. Costs $1 per book. There are a couple in India.
- Out of copyright books are fine, obviously. Out of print books? Lawsuit filed to try and get the orphans back.
Music
- How much is there? A few million (2-3). So that's doable as well.
- How much access? Jam bands: fans can tape but mustn't make money (à la CC). They tried hosting it themselves but ran out of bandwidth, so went to the IA (Internet Archive). Loads of stuff there now: 12,000 concerts, +30 per day. Bands just need to put their music under something like a CC license, then it can go up.
- In Europe everything from 1954 backwards is out of copyright. Get those 78s quick!
- Bands using the archive as file storage, linking from their sites
Moving images
- Most are copyrighted. Limited. Someone has 600 public domain movies that they are digitising.
- Other stuff that wasn't registered for copyright (when the law was different), e.g. home movies, art films
Television
- 20 TB per month
- Made available the week of September 11th. Most of the stuff is offline (figuring out rights etc.)
- Mosaic programme, aggregates and translates Middle Eastern TV stations
- Can it be done? BBC leading the way
Software
- About 50,000 titles?
- Problem in the US with rights: ripping software is illegal (can't break copy protection). Trying to copy software off old floppies. Got a copyright exemption that lasts for three years, two years left. They need help! Got to do it quick. Need physical disks and ripped stuff. Send your old software to the Internet Archive.
Web
- Best known for it. Started in the '90s (missed exact year). Archiving 20 TB compressed per month (~50 TB). The Wayback Machine! Surf the web as it was.
- Database runs on a cluster of Linux machines with flat files! ~300 TB database
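A minimal sketch of how a lookup over a sorted flat file could work, to make the "flat files" point concrete. This is not the IA's actual file format or code: the "url timestamp" line layout, the sample entries and the `captures` function are all assumptions for illustration, and an in-memory list stands in for the file.

```python
from bisect import bisect_left

# Hypothetical index: one capture per line, "url timestamp",
# kept sorted as plain text (a list stands in for the flat file).
INDEX_LINES = sorted([
    "example.com/ 19961231235959",
    "example.com/ 20010101120000",
    "example.com/about 20030101000000",
    "example.org/ 19990101000000",
])


def captures(url, lines=INDEX_LINES):
    """Return all capture timestamps recorded for url, oldest first."""
    prefix = url + " "
    # Binary search for the first line that could belong to this URL.
    start = bisect_left(lines, prefix)
    found = []
    for line in lines[start:]:
        if not line.startswith(prefix):
            break  # sorted order: no later line can match
        found.append(line.split(" ", 1)[1])
    return found


if __name__ == "__main__":
    for stamp in captures("example.com/"):
        print(stamp)
```

Because the lines are sorted, no database server is needed: a binary search finds the first capture of a URL and the rest follow directly after it.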
Storage is lots of Linux boxes, as hard drives are cheap.
What about backups? IA is on the San Andreas fault! In process of making 4 backups, eventually 5 or 6 nodes around the world.
Starting up a European archive. 80 machines in a rack (100 TB). They need a Tech director, good job if you're not scared of very large amounts of data.
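A rough back-of-envelope on the numbers quoted so far (20 TB compressed per month, a ~300 TB database, 80 machines and 100 TB per rack). It assumes a single copy of the data, no redundancy overhead and a constant growth rate, none of which the talk stated; the function names are made up for this sketch.

```python
import math

# Figures quoted in the notes.
MONTHLY_GROWTH_TB = 20   # compressed web data added per month
DATABASE_TB = 300        # approximate size of the web database
RACK_MACHINES = 80       # machines in one rack
RACK_CAPACITY_TB = 100   # storage in one rack


def tb_per_machine():
    """Average storage per machine in a rack."""
    return RACK_CAPACITY_TB / RACK_MACHINES


def racks_needed(total_tb):
    """Whole racks needed to hold total_tb (single copy, assumed)."""
    return math.ceil(total_tb / RACK_CAPACITY_TB)


def months_to_fill_rack():
    """Months of web growth one rack absorbs (constant rate, assumed)."""
    return RACK_CAPACITY_TB / MONTHLY_GROWTH_TB


if __name__ == "__main__":
    print("TB per machine:", tb_per_machine())
    print("Racks for the database:", racks_needed(DATABASE_TB))
    print("Months to fill a rack:", months_to_fill_rack())
```

On these assumptions the web crawl alone fills a new rack every few months, which fits the "lots of cheap Linux boxes" approach.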
IA want to put stuff up online that you wouldn't because you don't have space.
Access. One woman built a search engine 4 times the size of Google in two years using IA infrastructure. Now working at Google...oh. Want new ideas that the IA resources can support.
Wireless. Doing free wifi bandwidth.
Will we? Unsure, but getting there, slowly. Could be one of humanity's greatest achievements.
Questions
- Cory D asked why there is a delay in archiving. (BK) Admits it's slow, part of the process is manual.
- Video access is not the best yet, working on it, need help.
- How do you deal with hardware failures? (BK) 6% hard drive failure per year. IBM studies the IA! Had a bad time using RAID. Keep it simple.
- How do you know how old stuff is? (BK) Write it down off the box.
- Have an RSS feed for stuff going into the archive. Cool.
- Do you archive illegal things? (BK) We archive everything we can get hold of and make it as accessible as we can (BK phrased it better than that).
- Who will pay for it in 200 years time? (BK) We will.
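To put the 6% per year drive failure figure and the plan for 4 backups in perspective, here is a small sketch of the arithmetic. The fleet size of 1,000 drives is a made-up example, and the loss estimate assumes failures are independent and that nothing gets repaired or re-copied during the year, which is pessimistic and was not claimed in the talk.

```python
ANNUAL_FAILURE_RATE = 0.06  # hard drive failure rate quoted by BK


def expected_failures(drives, rate=ANNUAL_FAILURE_RATE):
    """Expected number of drives failing in one year."""
    return drives * rate


def prob_all_copies_fail(copies, rate=ANNUAL_FAILURE_RATE):
    """Chance that every copy of an item sits on a drive that fails
    within the same year (independent failures, no repair: assumed)."""
    return rate ** copies


if __name__ == "__main__":
    fleet = 1000  # hypothetical fleet size, not a figure from the talk
    print("Expected failures per year:", expected_failures(fleet))
    for copies in (1, 2, 4):
        print(copies, "copies ->", prob_all_copies_fail(copies))
```

Even with plain independent copies and no RAID, each extra copy cuts the chance of losing an item by a factor of roughly 17, which is one way to read "keep it simple".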