Image Analysis and Infrastructure Support for Data Mining the Farm Security Administration: Office of War Information Photography Collection

This paper reports on the initial work and future trajectory of the Image Analysis of the Farm Security Administration - Office of War Information Photography Collection team, supported through an XSEDE startup grant and the Extended Collaborative Support Service (ECSS). The team is developing algorithms and applying existing ones, running them on Comet to analyze the Farm Security Administration - Office of War Information (FSA-OWI) image corpus from 1935-1944, which is held by the Library of Congress (LOC) and publicly accessible online. The project serves the general public as well as many fields within the humanities, including photography, art, visual rhetoric, linguistics, American history, anthropology, and geography. Through robust analysis of images, metadata, and lexical semantics, researchers will gain deeper insight into the photographic techniques and aesthetics employed by FSA photographers, editorial decisions, and overall collection content. Pairing image analysis with metadata analysis, including lexical-semantic extraction, further expands the opportunities for deep data mining of this collection.
