Forbearing the Digital Dark Age: Metadata for Digital Objects
National Park Service U.S. Department of the Interior
Chris Dietrich
Digital Information Services Program
Lakewood, CO
Metadata, DB development
Data management training
NPS Digital Photo Metadata Standard
NPS Focus digital library
For today:
Digital Object Types
Importance of Metadata
Types of Metadata
Metadata standards and tools for:
  Photos
  Documents
  Audio/Video
  Geographic Information Systems (GIS)
  Tabular Data
General recommendations
Types of Digital Objects
The most common types:
Photos: JPEG, TIFF
Documents: Word, PDF
Audio/Video: MP3, MP4, many others
GIS Data: SHP, GDB, many types
Tabular Data: XLS, CSV, MDB, DBF (sometimes considered documents)
File type ≠ object type!
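As a rough illustration of a first-pass inventory, the Python sketch below groups file names by the formats named on this slide. The `OBJECT_TYPES` mapping and the `guess_object_type` and `inventory` names are assumptions made for this example, not part of any NPS tool. Because file type ≠ object type, the result is only a starting guess that a person should review.

```python
from collections import Counter
from pathlib import PurePath

# Assumed mapping, derived from the formats named on this slide.
# File type is not object type, so treat the result as a first guess only.
OBJECT_TYPES = {
    ".jpg": "Photo", ".jpeg": "Photo", ".tif": "Photo", ".tiff": "Photo",
    ".doc": "Document", ".docx": "Document", ".pdf": "Document",
    ".mp3": "Audio/Video", ".mp4": "Audio/Video",
    ".shp": "GIS Data", ".gdb": "GIS Data",
    ".xls": "Tabular Data", ".csv": "Tabular Data",
    ".mdb": "Tabular Data", ".dbf": "Tabular Data",
}


def guess_object_type(filename: str) -> str:
    """Guess an object type from the file extension (case-insensitive)."""
    suffix = PurePath(filename).suffix.lower()
    return OBJECT_TYPES.get(suffix, "Unknown")


def inventory(filenames) -> Counter:
    """Count how many files fall under each guessed object type."""
    return Counter(guess_object_type(name) for name in filenames)
```

A scanned photograph saved as a PDF would be counted here as a Document, which is exactly the kind of mismatch the slide warns about.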
Metadata
What is Metadata?
“Data about data”
Card catalog information for digital files
Describes who, what, when, why, how, etc.
Provides “handles” for managing objects
Supports usefulness & longevity
Importance of Metadata
Digital objects will be with us for a long time!
Describes the who, what, when, where, why, and how of a digital object
Adds information value to digital objects
Adds long-term preservation value
Types of Metadata
Descriptive – discovery
Administrative – management
Structural – storage
Other – other stuff
Descriptive Metadata
Discovery
Disambiguation
Examples:
  Title
  Author/Creator
  Date
  Keywords
Administrative Metadata
Manage/preserve objects in a repository
Examples:
  Access Restrictions
    Which users can discover an object
    What users can do with an object
  Provenance
    History of the object (origin, editing, conversion, etc.)
Structural Metadata
Storage and presentation
Logical/physical components
  Pages, chapters, volumes, etc.
Examples:
  HasPart
  IsPartOf
  IsRelatedTo
Technical Metadata
Instrument settings/properties
  Date/time
  Maker & model
  Settings: flash, white balance, macro, etc.
  Resolution/quality
“Other” Metadata
Rights
  Subset of Administrative
Preservation
  Subset of Administrative
  PREMIS: next week’s speaker
Geospatial
  Subset of Descriptive
Elements may cross over
Embedded vs. External Metadata
Embedded = metadata written to the file
External = metadata stored outside the file
  Companion/sidecar file: text, XML, XMP, spreadsheet
  Database/repository
Embedded Metadata
Metadata travels with the file
Can be extracted, manipulated, rewritten
Can be ingested on upload to repository
Redact before publishing:
  Privacy information
  Sensitive information
  Location information
May need to sync with repository/DB
External Metadata
Can be easily edited in bulk
Metadata can be orphaned from the file
Need to keep metadata and file synchronized
  If the file changes, the metadata must be updated too
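One way to notice orphaned or out-of-date external metadata is to store a checksum of the file inside its sidecar record. The Python sketch below is a minimal illustration under assumptions of its own: the `.meta.json` sidecar naming and the function names are hypothetical, not an NPS convention.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sidecar_path(path: Path) -> Path:
    """Hypothetical naming rule: photo.jpg -> photo.jpg.meta.json"""
    return path.with_name(path.name + ".meta.json")


def write_sidecar(path: Path, metadata: dict) -> Path:
    """Store metadata next to the file, along with the file's checksum."""
    record = {"sha256": file_sha256(path), "metadata": metadata}
    out = sidecar_path(path)
    out.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return out


def sidecar_status(path: Path) -> str:
    """Return 'missing', 'stale', or 'in sync' for the file's sidecar."""
    side = sidecar_path(path)
    if not side.exists():
        return "missing"
    record = json.loads(side.read_text(encoding="utf-8"))
    # A checksum mismatch means the file changed after the metadata was written.
    return "in sync" if record["sha256"] == file_sha256(path) else "stale"
```

A "stale" result only says that the file changed after the sidecar was written; deciding which metadata elements need updating still takes a person.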
Digital Photos
Photo File Formats
JPEG
  Low(er) resolution
  Distribution/access
TIFF
  High(er) resolution
  Archiving/backup
RAW
  Digital negative (straight from camera)
Many, many others…
Photo File Metadata
EXIF
  Technical and Descriptive metadata
  JPEG, TIFF, some RAW
XMP
  Adobe product (Descriptive)
  Wider adoption? (Windows)
IPTC
  Subsumed by XMP
Photo Metadata Tools
Windows Explorer
  Ubiquitous
  Requires little training
  Batch operations
  Metadata input & editing
  File renaming
  Discovery
  Limited functionality
Photo Metadata Tools
Proprietary tools
GPS Photo Link (Geospatial Experts)
  Batch operations
  Metadata import
  Metadata editing
  File renaming
  Photo watermarking
  Multiple outputs
  “Geospatial-centric”
Photo Metadata Tools
Proprietary tools
  ThumbsPlus!
  Adobe Lightroom/Bridge
  ACDSee
  Many, many others…
Some create a database for metadata
Some will embed metadata
Some will sync between photo and DB
Photo Metadata Tools
Free/Open source tools
Opanda IExifPro
  View/edit all Exif metadata
Windows Live Photo Gallery
  Free download
  “Prep and publish”
SourceForge: http://sourceforge.net
  Source for much free/shareware
MANY others…
NPS Digital Photo Metadata Standard
Seven required elements:
  Title
  Create Date
  Contact Information
  Access Constraints
  Constraints Information
  Place Description
  NPS Unit Information
NPS Digital Photo Metadata Standard
Seven required elements:
  Title: who/what, where, when
  Create Date: born or digitized
  Contact Information: photographer/steward
  Access Constraints: copyright, privacy, etc.
  Constraints Information: describe constraints
  Place Description: place name
  NPS Unit Information: park name/code
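A completeness check against the seven required elements could look something like the Python sketch below. It assumes the metadata has already been extracted into a plain dictionary keyed by the element names on this slide; the `missing_elements` helper is hypothetical and does not validate the content of any element.

```python
# Element names copied from the slide; the dictionary layout is an assumption.
REQUIRED_ELEMENTS = [
    "Title",
    "Create Date",
    "Contact Information",
    "Access Constraints",
    "Constraints Information",
    "Place Description",
    "NPS Unit Information",
]


def missing_elements(record: dict) -> list:
    """Return the required element names that are absent or blank."""
    missing = []
    for name in REQUIRED_ELEMENTS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing
```

For example, a record holding only a Title and a Create Date would come back with the other five element names listed as missing.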
Digital Documents
Document File Formats
Word
  Very common
PDF
  Usually created from another document
  An uneditable snapshot
  Can contain other types (photos)
Many, many others
Document Metadata
Dublin Core
  Flat, flexible, and easy to use (+)
  Can be used for any object type (+)
  Can be imprecise (-)
  Simple and Qualified implementations
TextMD
  Technical metadata for text objects
  Supported by the Library of Congress (LOC)
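To show how flat a Simple Dublin Core record is, the Python sketch below writes a few elements as XML using only the standard library. The namespace URI is the Dublin Core elements 1.1 namespace; the `metadata` wrapper element and the `dublin_core_xml` helper are assumptions for this example, not a prescribed record format.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_xml(fields: dict) -> str:
    """Serialize element-name/value pairs as a flat Dublin Core record."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_xml({
    "title": "Bird Monitoring Report",
    "creator": "Example Author",  # placeholder value
    "date": "2013",
})
```

Every element sits at the same level, which is what makes the record easy to produce and also why it can be imprecise.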
Document Metadata Tools
Windows Explorer
  Batch metadata edit & rename
MS Word
  Individual file metadata editing
Adobe Acrobat
  Individual file metadata editing
  Inherits metadata from Word et al.
  Advanced/custom metadata elements
Digital Audio/Video
Audio File Formats
Uncompressed
  WAV
  AIFF
Lossless compression
  MPEG-4
  WMA Lossless
Lossy compression
  MP3
  WMA Lossy
Audio Uncompressed
WAV
  Windows OS compatible
AIFF
  Apple/Mac OS compatible
Both support embedded metadata
Both suitable for archiving
Audio Lossless Compression
MPEG-4
WMA Lossless
FLAC
All are high-quality
All support embedded metadata
Audio Lossy Compression
MP3 (≠ MPEG-3)
  Most common digital audio format
  Data reduction + faithful reproduction
  ID3 metadata (embedded):
    Title
    Artist
    Album
    Track number, etc.
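As a small illustration of embedded metadata, the Python sketch below reads the oldest ID3 variant, ID3v1, which is a fixed 128-byte block at the end of an MP3 file. It is a sketch only: many MP3 files carry ID3v2 tags at the start of the file instead, which this code does not read, and the function names are made up for this example.

```python
import struct
from typing import Optional

ID3V1_SIZE = 128  # ID3v1 is a fixed-size block at the very end of the file


def _text(raw: bytes) -> str:
    """Decode a fixed-width ID3v1 field, dropping NUL/space padding."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(tag: bytes) -> Optional[dict]:
    """Parse a 128-byte ID3v1/ID3v1.1 block; return None if it is not one."""
    if len(tag) != ID3V1_SIZE or not tag.startswith(b"TAG"):
        return None
    title, artist, album, year, comment, _genre = struct.unpack(
        "<3x30s30s30s4s30sB", tag
    )
    # ID3v1.1: a zero byte at comment[28] means comment[29] is the track number.
    track = comment[29] if comment[28] == 0 and comment[29] != 0 else None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "track": track,
    }


def read_id3v1(path: str) -> Optional[dict]:
    """Read the trailing ID3v1 block from a file, if one is present."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        if handle.tell() < ID3V1_SIZE:
            return None
        handle.seek(-ID3V1_SIZE, 2)
        return parse_id3v1(handle.read(ID3V1_SIZE))
```

The fixed 30-byte fields show why ID3v1 titles are often truncated, and why richer element sets are attractive for archival description.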
Audio Lossy Compression
WMA
  Requires Windows Media Player
  Common format
  Metadata similar to ID3:
    Title
    Artist
    Album
    Track number, etc.
Video File Formats
Common video file types
Wrappers
  AVI (Audio Video Interleave)
  WMV (Windows Media Video)
  MOV (Apple QuickTime)
Codecs
  MPEG-4/H.264
Some wrappers and codecs are incompatible
Audio Metadata
AES60-2011
Core audio element set:
  Title Creator Subject Description Publisher Contributor Date Type Format Identifier
  Source Language Relation Coverage Rights Version Publication History Metadata Provider Entity Type
Audio Metadata
AES60-2011
  More robust element set than ID3
  Maps to ID3 and Dublin Core
  “Bridge between cultural-heritage databases, broadcasting production systems and broadcasting archive repositories.”
Video Metadata
MPEG-7 Metadata Standard
  Uses XML to store metadata
  Can tag events in A/V stream
  Complements other MPEG standards
  Hierarchical, technical, and complex (relative to Dublin Core)
  Can be used for audio and video
Audio/Video Metadata
Dublin Core for audio
  Flexible, easy, and widely adopted
  Supports XML
  Maps to ID3 embedded audio
  Useful for interchange
  “Flat” structure
  Not a robust technical metadata standard
GIS Data
GIS: Geographic Information Systems
“Computerized mapping”
Governments, utilities, businesses
Resource & asset management
Data analysis and presentation
GPS navigation systems
Google Maps/Google Earth
GIS Data File Formats
File types
  SHP – shapefile
  GDB – geodatabase
  KML/KMZ – Google Earth
  E00 – coverage
Also:
  Geospatial web services
  Geospatial TIFF, JPEG, and PDF files
GIS Metadata
Metadata standards
FGDC CSDGM
  Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata
ISO 19115 Geographic Information-Metadata
ISO 19115 North American Profile (NAP)
ISO 19110 Geographic Information-Methodology for feature cataloguing
GIS Metadata: CSDGM
CSDGM
  Adopted in 1994, revised 1998
  Supports Extensions and Profiles
    Remote Sensing Extension
    Biological Profile
    Shoreline Profile
Required for US Federal agencies
  EO 12906
GIS Metadata: ISO Standards
Developed to support international exchange
ISO 19115 Geographic Information-Metadata
ISO 19115 - North American Profile (NAP)
  Reconciles CSDGM w/ ISO 19115
ISO 19110 Geog. Information-Methodology
  Captures CSDGM Section 5
Tools being developed
Not yet widely adopted in the US
GIS Metadata Tools
ArcGIS 10.x
  Most widely used GIS software
  Supports multiple metadata standards
EPA Metadata Editor (EME)
  Standalone or ArcGIS extension
  Flexible/configurable
  Metadata synchronization and validation
Tabular Data
Tabular Data File Formats
Spreadsheets, databases, delimited text
File types (MANY!)
  XLS – MS Excel
  CSV – comma-separated values
  MDB – MS Access
  DBF – dBASE I-IV
  XML – Extensible Markup Language
Tabular Data Metadata
FGDC CSDGM
  Section 5: Entity and Attribute Information
ISO 19110
  Analogous to CSDGM Section 5
  Separate standard incorporated by reference into ISO 19115
Dublin Core (Qualified?)
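Entity and attribute documentation usually starts from a list of the columns in a table. The Python sketch below drafts such a list from CSV text; it is only a starting point under assumed field names (`attribute`, `non_empty`, `distinct`, `definition`) and does not produce CSDGM or ISO 19110 output. The definitions still have to be written by someone who knows the data.

```python
import csv
import io


def attribute_summary(csv_text: str) -> list:
    """Draft one record per column: name, value counts, blank definition."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    summary = []
    for name in reader.fieldnames or []:
        # Ignore blank cells and cells missing from short rows.
        values = [row[name] for row in rows if row[name] not in (None, "")]
        summary.append({
            "attribute": name,
            "non_empty": len(values),
            "distinct": len(set(values)),
            "definition": "",  # to be filled in by the data steward
        })
    return summary


sample = "site,species,count\nA,Raven,3\nB,Raven,\nA,Wren,1\n"
draft = attribute_summary(sample)
```

The counts can help spot columns that are mostly empty or that behave like coded values, both of which are worth explaining in the attribute definitions.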
Tabular Data Metadata Tools
MS Excel: transform and add metadata
PDF(/A)
  XMP embedded metadata
  Can be used to containerize/wrap
  May be effective for archiving
  Validation tools needed(?)
Native software
General Recommendations
All object types…
Suggested Workflow
1) Inventory
2) Prioritize
3) Categorize
4) Describe
5) Backup
6) Archive
7) Share
File Naming Convention
Recommend:
  KISS: minimum needed
  Pick/create a standard
  Use it & stick with it
Example:
  ParkCode_Year_Description_Seq#
  YOSE_2013_BirdMonitoring_0001.jpg
Some software does batch renaming
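The example pattern above can be turned into a small helper. The Python sketch below builds and checks names of the form ParkCode_Year_Description_Seq#; the four-letter park code, four-digit sequence number, and the `build_name` and `follows_convention` names are assumptions inferred from the single example on this slide, not a published rule.

```python
import re

# Assumed shape, inferred from YOSE_2013_BirdMonitoring_0001.jpg
NAME_PATTERN = re.compile(r"^[A-Z]{4}_\d{4}_[A-Za-z0-9]+_\d{4}\.[a-z0-9]+$")


def build_name(park_code: str, year: int, description: str, seq: int, ext: str) -> str:
    """Build a file name such as YOSE_2013_BirdMonitoring_0001.jpg."""
    # Keep only letters/digits and capitalize each word: "bird monitoring" -> "BirdMonitoring"
    words = re.findall(r"[A-Za-z0-9]+", description)
    desc = "".join(word[0].upper() + word[1:] for word in words)
    return f"{park_code.upper()}_{year}_{desc}_{seq:04d}.{ext.lstrip('.').lower()}"


def follows_convention(filename: str) -> bool:
    """Check whether a file name matches the assumed pattern."""
    return NAME_PATTERN.match(filename) is not None


name = build_name("yose", 2013, "bird monitoring", 1, ".JPG")
```

Running the check over a folder before renaming anything is a low-risk way to see how many existing files already follow the convention.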
Enterprise Metadata Tool
Windows SharePoint “Libraries”
  Documents, photos, video, data files
  Capture embedded metadata
  Require metadata entry
  Controlled terms & social tagging
  Limit on number of files/library & storage
  Requires configuration/management
NPS “T.A.D.A.” Draft Standard
Title
Author(/Photographer/Videographer)
Date
Access
TADA! You’re done! (Not really…)
Remaining required elements
Other metadata:
  Subjects – use taxonomy created above?
  Keywords – user generated
Thank you!
[email protected]
Post-Webinar Survey: http://www.surveymonkey.com/s/2013_DP2
Channel Islands National Park