CAD Methods Profiles

How do researchers use collections as data?

This series is designed to help people who work in libraries, archives and museums gain a better understanding of common research methods that use cultural heritage collections for computational analysis. The descriptions are simplified versions of the methods, presented mostly in terms of their implications for the creation, description, packaging, or distribution of collections as data. The series should be used in the context of the principles articulated in the Santa Barbara Statement on Collections as Data.

Laurie Allen will serve as editor for the first round of profiles, and we are looking for volunteers to create draft profiles. If you’re willing to create a profile, or you’d like to recommend someone for Laurie to recruit, please email Laurie Allen (

Completed Profiles

Text Mining

  • Laurie Allen and Scott Enderle, University of Pennsylvania

Network Analysis (pending review)

  • Scott Weingart, Carnegie Mellon; Thomas Padilla, UNLV

Needed Profiles

Mapping (assigned)

Image Analysis (needs volunteers)

Audio Analysis (needs volunteers)

Collation (needs volunteers)

Visualization (needs volunteers)

Your idea here (needs volunteers!)

A note about drafting these profiles: The first profile was created by Laurie Allen (a librarian with a deep understanding of libraries and a broad understanding of text mining) with massive help from Scott Enderle (a scholar with a deep understanding of text mining and a broad understanding of libraries). It is hard for someone with deep knowledge of a method to generalize about it (though Scott is particularly awesome at that), and it is hard for someone without deep knowledge to know which sticking points libraries should look out for.

We encourage profiles to be co-authored in this way, so that they reflect the combined expertise of disciplinary and library colleagues.

Methods Profile Template

What is it?

A one- or two-sentence massive oversimplification of what it is, followed by a few quotes from disciplinary definitions, with links to other definitions.

Who uses it?

A rough accounting of how the method is approached across disciplines, with some keywords that readers could look into further.

What form of data is most useful for it?

A one- or two-sentence, super clear statement of what this method calls for (e.g., format, content type, structured or unstructured, extent, size, accompanying metadata).

What data features might researchers explore when they’re doing it?

A little more information about how the data is used in this method. Less on process and more on what the researchers are trying to see/test/do.

Common Tools

A list of tools.

Other notes

A catch-all for other notes.

Examples of this method in use

Two to three links or citations from researchers using this method.

Examples of collections optimized for this use

Two to three links to collections that have been prepared with this use in mind.