TRIKE Project Plan
- Team Members and roles
- Nancy Foasberg: content strategy and development, editor, OER perspective.
- Rob Garfield: technical development, pedagogical perspective.
- Hannah House: content strategy and development, content project manager, data critique perspective.
- Natasha Ochshorn: content strategy and development, philosophy of DH perspective, communications.
- Sabina Pringle: outreach, technical project management and development.
- Abstract
TRIKE (Transformational Repository for Instruction, Knowledge, and Explication) is an open educational resource providing infrastructure for the sharing of datasets and data transformations alongside humanistic interrogation of the decisions made in selecting and working with that data. As a resource, it is being created primarily for use in courses in the digital humanities, providing pedagogical scaffolding in the form of thematically and textually linked data-based experiments that will model how data-work fits into the humanist’s toolbox. Additionally, we envision this modeling will have broader applications benefiting humanities research outside of the classroom by providing any interested researcher with contextualizing information about the decisions made when working with data. We propose development of a prototype website that we hope will be used by faculty partners and their students in Fall 2019. We will evaluate the prototype against predetermined measures of success to determine next steps for a new, funded scope to expand the prototype’s capabilities and reach.
- Environmental scan
Humanities datasets have proliferated. However, while our environmental scan revealed several projects addressing one or more of the modes we plan to work in, few projects marry datasets, methodological instruction, and data criticality, despite calls for this kind of work.
The prototype we propose expands beyond existing resources because it: a) provides access to a varied set of datasets at multiple stages, b) comments on the datasets to illuminate both the constructed nature of all data and the specific choices that were made when handling these datasets, and c) provides nuanced, methodological instruction based in concrete datasets while acknowledging that multiple methodologies may be appropriate for any given data.
Existing instructional datasets
Alan Liu’s DH Toychest provides curated lists of digital humanities tools and tutorials for various approaches to data alongside a “starting set” of demo corpora, but does not provide “work-in-progress” snapshots of datasets. The Perseids Project structures datasets for reuse and provides lesson plans for instructors, but is methodologically and disciplinarily specific.
Existing methodological tutorials
Collections of methodological tutorials can be found at sites including: Tooling up for Digital Humanities, devdh.org, Digital Art History 101, and Nodegoat. These resources vary in breadth and focus, and provide advice and tutorials for novice DHers, but do not provide sample datasets.
Other projects include data with suggestions of how it might be used. Jonathan Reeve’s Corpus DB provides both data and some Python scripts with which it can be manipulated. The M.O.N.K. Project offers public domain collections and the schemas that go along with them. Making the History of 1989 provides lesson plans and modules alongside primary sources. None represent the choices made by researchers as they work with a corpus.
Existing data collections
Electronic, humanities-relevant data collections, both open and toll-access, are plentiful, but vary in the amount of user support and access to data that they provide. Many of the largest collections provide some instruction for researchers, including the Digital Public Library of America’s “DPLA Pro”, Europeana’s “Europeana Pro”, “Chronicling America”, the New York Public Library digital collections and associated API, and the Open Science Framework. Toll-access resources such as JSTOR and primary source databases offered by GALE and Alexander Street Press may also serve as data sources; JSTOR in particular helps to enable this with JSTOR Data for Research. Institutions such as Michigan State University also curate datasets, but do not provide access to the general public (Higgins, Kudzia, and Rodriguez). Other potential data sources include NYC Open Data, HathiTrust, Wikidata, and the Vera Institute of Justice.
TRIKE works on a much smaller scale than the resources listed above, but will be much more explicit in its relationship to both pedagogy and methodology.
While instructional datasets, methodological tutorials, and data collections all exist, we did not find a tool that uses a variety of types of real data to demonstrate the process of data preparation and provide tutorials for their instructional use. TRIKE will be a unique resource and will help to fulfill an important need.
- What technologies will be used?
- Our site will be developed in WordPress and hosted on the Commons to leverage the existing support base and community around both.
- Data files and documentation will be made available on GitHub
- Other Technologies to be determined by the datasets we choose
- Definite:
- data prep, analysis and processing: Python
- Potential:
- Mapping: Carto
- Visualization Tools: Tableau
- Topic Modeling: NLTK or gensim
- Image analysis
- Gephi, Cytoscape, or another tool for network analysis
- Which of these are known?
- Rob, Sabina, and Nancy have experience with WordPress
- Sabina has experience with Manifold
- Nancy has experience with Omeka
- Hannah, Rob, Nancy and Sabina are studying Python
- Hannah and Sabina have experience with Tableau
- Hannah has experience with Carto and Cytoscape
- Which need to be learned?
- We will need to explore available themes and plugins for WordPress on Commons to find the ones best suited to the organizational framework of our project
- We are all pretty new to Slack, so we are experiencing a learning curve there as well
- We will also likely need to learn more about whichever of the analysis tools we choose to use
- What’s the plan to learn them? What support is needed?
- Python Users Group (if applicable)
- Consultations with the Digital Fellows and Andie
- Consultations with Jonathan Reeve and Patrick Smyth
- Online Tutorials and other documentation
- NYCDH week workshops, particularly on Omeka and gensim
- How will the project be managed?
- Slack with Google Drive (and Giphy) integrations
- Milestones
- See attached PDF of timeline
Project Timeline
- See attached PDF of timeline




Onward!
Thanks, everyone, for your energy and engagement last night. We have a very exciting and diverse range of projects to look forward to! As a reminder, the college is closed next week, so we do not meet as a class. I encourage you to make alternate arrangements and continue on the tasks listed in your work plan. Please feel free to message me and add me to communication spaces as needed! I will be posting feedback on proposals to our Group page tomorrow, and would like to see a revised proposal with the appropriate research, team member responsibilities, etc., around the time we meet again (posted to this site, see schedule). In particular, it will be useful to work on a project narrative that you can eventually post on your front-facing web presence to generate audience interest. Some specifics for each group as you move forward:
Database of Immigrant News
Lost Art Collective
Project T.R.I.K.E.
Freedom Dreaming