Syllabus (Spring, 2017)

The first iteration of Distant Reading was built around four projects and one warm-up project:

Digital Humanities Warm-Up (Week One)
Project 1: Text Analysis of Hamlet (Week Two)
Project 2: Text Analysis of Your Own Writing (Week Three)
Project 3: Independent Project #1 (Weeks Four through Six)
Project 4: Independent Project #2 (Weeks Seven through Nine)

See below for detailed descriptions of each assignment.  This curriculum is licensed under a Creative Commons Attribution 4.0 International License and is freely usable with attribution.  Feedback is also welcome.



Week One: Digital Humanities Warm-Up

Google nGram Viewer shows us how many times a word appears in a large corpus (body) of printed literature in English.  Or rather, it shows the percentage of that corpus made up by an individual word.  This is pretty amazing!  What can you do with this?

    • What happens when you put in the names of countries, like “China” and “France”?  
    • What about the words “science” and “literature”?
    • What about: “be or not to be” and “darkness visible” and “a more perfect union”?
    • What about: “apple” and “Apple”?  
    • What about: “lacrosse” and “Frisbee”?
    • What about: “David Matthews”?
    • What about: “Row Your Boat” and “Mary Had a Little Lamb” and “Itsy Bitsy Spider”?
    • What about: “December 25” and “December 7” and “September 11”? (Shift the end date to 2010)

Critically, what questions does each search prompt for you?  Your assignment for tomorrow and Friday is to play with Google nGram Viewer and report back what you find, how you found it, and what significance you might draw from it.  This pre-assignment spans the next two classes.

For Thursday:

Pick topics of interest to you, and play with Google nGram Viewer.  Search for words and phrases.  Use verbs and nouns.  People, places, events–as in the examples above. Pay attention to four things:

    • First: What do you see? What surprises you? What theories can you devise about what you see?
    • Second: How can you test your theories with other searches?  Do it!
    • Third: As you experiment, pay attention to how you change what you search for.  How do you change your search as you grow more familiar with this tool?  Why? Where do you end up?
    • Fourth, consider: What are the opportunities and limits of this search tool?

Be prepared to give a short presentation to the class on these four questions.  Show your work!  

(Note: take screen captures of your searches, and drag the images into your document.)

For Friday:

In a journal entry, write up the results of your searches with Google nGram Viewer, including anything you learned further from our class discussions.  Though this is a journal entry, it should be persuasive and in your best prose.  It’s an opportunity to discuss and explain four lines of inquiry:

    • What did you see?  What theories emerged for you?
    • How did you test your theories?  What were the results?
    • What implications or conclusions did you draw from the results?
    • What are your reflections on the opportunities and limits of Google nGram Viewer?

The entry should be around 500-750 words and should be submitted before the start of class on Friday.

[Link to PDF version of assignment]


Project 1: Text Analysis of Hamlet (Week 2)

Introduction

What is the range of possible questions we can answer when we use computers to look at texts? In Network Theory, Plot Analysis, Franco Moretti begins to explore these questions, thinking about the potential uses (and limitations) of quantitative data in the work of literary criticism.  In this assessment, we will follow a path similar to Moretti’s, growing familiar with Wolfram Language and its vast capabilities and thinking about the many ways in which computers allow us to take a different kind of look at texts. We will “read” Hamlet, but from a distance.

Here is Part 1 of a weeklong project that will be due Monday. We’ll share more later this week!

Part 1:

  1. Find, import, and clean (as needed) an online, full text version of Hamlet.
  2. Using Wolfram Language, perform some computational linguistic analyses on Hamlet:
    • How many unique words are in the text?
    • Select several words and make histograms of their placement in the text.
    • What is the average sentence length?
    • What is the average word length?
    • What percentage of words are stopwords?
  3. Choose at least two other analyses to perform.  These might include:
    • Perform a sentiment analysis of sentences in the text.  Graph it.
    • Determine the prevalence of different parts of speech.
    • Determine the Flesch-Kincaid grade level of the text.
    • Find the geographic place names in the text and plot them on a map.
    • Or anything else you can dream up!  (The documentation center might prove helpful.)
  4. Then, choose at least two other texts from Project Gutenberg you find interesting.
    • Import and clean them, and then perform the same analyses.
    • Compare results across the three texts.
  5. Save your work in a Mathematica notebook.  Use Headings and plain text to comment on it.
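
To make the required measurements concrete, here is a minimal sketch of the unique-word, sentence-length, word-length, stopword, and word-placement calculations. It is written in Python purely as an illustration; it is not the Wolfram Language code you will write for this project. The tiny stopword list, the sample text, the sentence-splitting rule, and the function names `basic_stats` and `word_positions` are all assumptions made for this example.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "is", "in", "that", "not", "be"}


def split_words(text):
    """Lowercase the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Split after ., ! or ? followed by whitespace. This is a rough heuristic."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def basic_stats(text):
    """Return the four required measurements for a piece of text."""
    words = split_words(text)
    sentences = split_sentences(text)
    return {
        "unique_words": len(set(words)),
        "avg_sentence_length": len(words) / len(sentences),  # words per sentence
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "stopword_pct": 100 * sum(w in STOPWORDS for w in words) / len(words),
    }


def word_positions(text, target):
    """Relative positions (0.0 = start, 1.0 = end) of a word, ready for a histogram."""
    words = split_words(text)
    return [i / len(words) for i, w in enumerate(words) if w == target]


sample = "To be, or not to be, that is the question. The rest is silence."
stats = basic_stats(sample)
positions = word_positions(sample, "be")
```

Notice how many decisions hide inside even these few lines: what counts as a word, where a sentence ends, and which words belong on the stopword list. Each choice changes the numbers, which is worth keeping in mind when you compare results across texts.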

A good target date for finishing this part is Thursday or Friday.

Text Analysis of Hamlet – Parts 2 & 3

Now that we have performed some basic linguistic analyses, let’s consider these analyses in context.  What do they provide?  What don’t they provide?  To pursue these questions with more focus, continue onto Parts 2 and 3:

Part 2:

  1. When you have finished your text analyses, read Network Theory, Plot Analysis. Consider: what does this pamphlet add to our understanding of the work we have just performed? Or does it just confirm what we already know about Hamlet?  Annotate the reading as you go.  

Part 3:

  1. Compose a 750-1000 word interpretation of Hamlet, based on the results of your analysis. Your interpretation should explore the following questions:
    • What did your computational analysis tell you that you could not have learned simply from reading Hamlet?  In other words: What is the significance of the data that your analyses revealed?
    • What are the limits to your ability to analyze the texts through these methods?  Refer to Network Theory, Plot Analysis as appropriate.

Your annotated Mathematica notebook and your writing are due at the beginning of class on Tuesday, April 11.  In class that day, you’ll share your results with each other.  

~

Your work will be assessed on:

  • Your code submission:
    • Have I successfully completed the required commands?
    • Did I annotate and comment on my code?
    • Is the code easily readable?
  • A written reflection:
    • Is my writing persuasive and driven by a main argument or set of arguments?
    • Does my reflection contain thorough, substantial responses to each of the questions in part 3, thoughtfully using evidence to support points made?
    • Does my reflection demonstrate creative engagement with the central ideas in Network Theory, Plot Analysis?
    • Do I use the questions in the assessment as a starting point to come up with more questions, ideas, possible experiments, beginning thoughts about the nature of distant reading/digital humanities?

[Link to PDF version of assignment]


Project 2: Text Analysis of Your Writing (Week 3)

Introduction

The Reputation Squad team at Medium writes that their computational analysis of the speeches of Donald Trump, Hillary Clinton, and Barack Obama reveals the “hidden components of their political communication.” What might the hidden components of your own communications be? What can a computational analysis tell us about your writing throughout your Deerfield or high school career? What can Wolfram Language allow you to see about your writing that you might not have otherwise seen? What might be missing from this picture of you as a writer? In this assessment, we will deepen our familiarity with Wolfram Language and continue thinking about the particular kinds of questions computational analyses can help us answer.

Part 1:

  1. Create a file to analyze.  Gather documents you have composed for your classes at Deerfield and perhaps beyond.  Copy and paste the text of those files into one large file that contains all of your writing from Deerfield or from high school.  Save it as a .txt file.
    • Consider whether or not you should include titles, headers, works cited, etc.
    • Consider the order in which your writing appears in the document.
    • Consider adding annotations to separate each year.  These might be markers that you can clean for certain analyses and keep to segment the text.  See step 4, below.
  2. Import and clean (as needed) the full text of your writing.
  3. Consider segmenting the text.  Can you create strings of different grade levels, or of writing for certain classes or departments?
  4. Using Wolfram Language, perform some computational linguistic analyses on your writing:
    • How many unique words?
    • Select several words and make histograms of their placement in the text.
    • What is the average sentence length?
    • What is the average word length?
    • What percentage of words are stopwords?
    • Include the two analyses you performed on Project #1.
  5. Choose at least two other analyses of your own devising.  Draw inspiration from your classmates or from browsing the Documentation Center and/or An Elementary Introduction to the Wolfram Language.  These might include visualization techniques of the data (or segments of the data) you’ve already generated.
  6. Save your work in a Mathematica Notebook.  Use Headings and plain text to comment on it.
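
The idea of annotation markers that you can keep for segmenting and clean for whole-text analyses can be sketched as follows. This is an illustration in Python, not Wolfram Language, and the marker format (`=== GRADE 9 ===` on its own line), the sample text, and the function names are hypothetical choices made for this example; any marker that never occurs in your actual writing would work.

```python
import re

# Hypothetical marker format: a line such as "=== GRADE 9 ===" placed
# before each year's writing. Pick any pattern absent from your own prose.
MARKER = re.compile(r"^=== (.+?) ===[ \t]*$", re.MULTILINE)


def segment_by_marker(text):
    """Return {label: text} for each marked section, with the markers removed."""
    parts = MARKER.split(text)
    # parts looks like [text before first marker, label1, body1, label2, body2, ...]
    return {label: body.strip() for label, body in zip(parts[1::2], parts[2::2])}


def strip_markers(text):
    """Remove the marker lines so they do not pollute whole-text analyses."""
    return MARKER.sub("", text)


sample = """=== GRADE 9 ===
I like books. Books are good.
=== GRADE 12 ===
Literature rewards sustained, attentive rereading.
"""

segments = segment_by_marker(sample)
cleaned = strip_markers(sample)

# Compare one simple measurement, average word length, across the segments.
for label, body in segments.items():
    words = re.findall(r"[A-Za-z']+", body)
    print(label, round(sum(len(w) for w in words) / len(words), 2))
```

Keeping one marked-up file and deriving both the segments and the cleaned text from it means every analysis starts from the same source.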

Part 2:

  1. Read “Semantics–What does data science reveal about Clinton and Trump?”  Identify the questions asked and arguments made by the writers, based on their data. How did this article change your thoughts about Clinton, Trump, and Obama? Or did it just confirm what you already thought?  What does this add to your understanding of the kind of analyses you can perform?

Part 3:

  1. Compose a 750-1000 word interpretation of your own writing, based on the results of your analysis. Your interpretation should consider the following questions, among others: What new information did your computational analysis tell you about your writing? If there wasn’t new information, how did the computational analysis allow you to look at your writing in a different way? What did this computational analysis miss about you as a writer?

Your annotated Mathematica notebook and your writing are due at the beginning of class on Tuesday, April 18.  In class that day, you’ll share your results with each other.  

~

Your work will be assessed on:

  • Your code submission:
    • Have I successfully completed the required commands?
    • Did I annotate and comment on my code?
    • Is the code easily readable?
  • A written reflection:
    • Is my writing persuasive and driven by a main argument or set of arguments?
    • Does my reflection contain thorough, substantial responses to each of the questions in part 3, thoughtfully using evidence to support points made?
    • Do I use the questions in the assessment as a starting point to come up with more questions, ideas, possible experiments, beginning thoughts about the nature of distant reading/digital humanities?

[Link to PDF version of assignment]


Project 3: Independent Project #1 (Weeks 4-6)

Introduction

Digital humanities is a young field, and scholars are just beginning to explore the range of questions we can ask with the new tools we have at our disposal.  So far, we as a class have performed some basic analyses: we’ve calculated average word lengths and sentence lengths, we’ve plotted histograms of our findings, we’ve classified words based on parts of speech or sentiment, and more.  Having done this now both on Hamlet and on our own writing, and having developed a basic facility with Wolfram Language, we can now explore the world of literature and writing more deeply.  In this project, you have an opportunity to choose a research question and a body of text that are of interest to you.  What, if you looked at it through a computational lens, do you think would yield interesting and meaningful results?  We’ll start by considering the range of tools and the range of texts available to us…

Part 1:

  1. Consider the range of analyses you can perform:
    • Bigrams, trigrams, and more: Deerfield alum Jonathan Harris created a website called “We Feel Fine” that scours thousands of blogs around the world for the words “I feel” and records the word that follows.  From it, the site develops a sense of the mood of the world (via blogs) at any given moment.  How might you do something similar with a public text?
    • Word proximity: How often does the word “life” appear in the same sentence as “death”?  How often does one appear without the other?  What other words or groups of words might you search for to see if they appear in proximity with others?
    • Finding patterns: When Hamlet says, “What a rogue and peasant slave am I,” he uses a rhetorical device called hendiadys, a pair of words connected by a conjunction and used to express a single thought.  Shakespeare uses hendiadys more in Hamlet than in any other play, in phrases like “the abstract and brief chronicles of our time” and “‘Tis sweet and commendable in your nature.”  How might you find patterns like this in other works?
    • Place Names: Professor Matthew Wilkens extracted place names from 19th century novels before and after the Civil War in order to understand whether the geographic imagination of American novelists changed after the war.  What might you do with place names?
    • Mapping: What if you mapped the destination of every letter that Thomas Jefferson wrote?  What if you displayed that range over time?  What would that reveal about the communication patterns and range of the ideas that influenced this founding father?
    • Interactive Mapping: What if you created an interactive map of places in The Voyage of the Beagle?  How would that change the way you understood Darwin’s travels?
    • Word Frequency: Words used more frequently in comments for top students help reveal traits of students who excel.  What else might sustain a word frequency analysis?
    • Plotting: A recently published work coming out of the Literary Lab plots emotional level and physical/abstract level of novels, placing two variables on one graph: the quantity of words that convey emotion, and the quantity of words that are concrete or abstract.  Which authors are both emotional and abstract?  Which are emotional and concrete?  Jane Austen, it turns out, inhabits her own space.  Where do you surmise this might be?  Where might other authors live?
    • Word Location in Text: Another paper currently in development looks at when the word “death” appears within a whole corpus of novels.  Visualized along a number line for each text, and compared across hundreds or thousands of texts, this might reveal patterns in plot development across entire genres of novels.
    • Network graphs: Franco Moretti created hand drawn social network analyses of relationships in Hamlet.  These can be done computationally as well.  What networks between people might you graph?  What networks between words?  Think on that.
  2. Each of these analyses could be applied to a whole range of sources.  To prompt your thinking, consider what you might explore:
    • Novels: What questions might you ask about the fictional characters and stories composed over decades or centuries of time?
    • Memoirs: Consider the journals and autobiographies of writers, artists, inventors, or others.  How might you use some of these analyses to understand their thinking?
    • Drama: After considering ways to clean the data, what might character dialog data reveal over time?  How might you visualize relationships created by certain playwrights?
    • Journalism: What if you pulled geographic place names from movie reviews by the New York Times over the 20th century?  Or the past decade?  

These are only the beginning!  Take a spin around the library.  What do you see?
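
Two of the analyses listed above, recording the word that follows a phrase such as “I feel” and checking whether two words share a sentence, can be sketched in a few lines. The sketch is in Python for illustration only, not Wolfram Language; the sample text, the sentence-splitting rule, and the function names `words_after` and `cooccurrence` are assumptions made for this example.

```python
import re
from collections import Counter


def words_after(text, phrase):
    """Count the word that immediately follows each occurrence of a phrase."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\s+([a-z']+)", re.IGNORECASE)
    return Counter(word.lower() for word in pattern.findall(text))


def cooccurrence(text, first, second):
    """Count sentences containing both words, only the first, or only the second."""
    counts = Counter()
    # Rough heuristic: a sentence ends at ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        has_first, has_second = first in words, second in words
        if has_first and has_second:
            counts["both"] += 1
        elif has_first:
            counts["first_only"] += 1
        elif has_second:
            counts["second_only"] += 1
    return counts


sample = (
    "I feel happy today. I feel tired of death. "
    "Life and death are neighbors. Life goes on. I feel happy again."
)

feelings = words_after(sample, "I feel")
pairs = cooccurrence(sample, "life", "death")
```

The same two functions could be pointed at any cleaned text, and swapping in a different phrase or word pair is often enough to turn a hunch into a first research question.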

DAY ONE HOMEWORK:

For our next class, your assignment is to re-read through this list of analyses you can perform, and then to take a stroll through the library, looking at individual texts or categories of texts.  Use the combination of ideas–tools and texts–to develop three different research questions that you think you might enjoy pursuing for your next project, for which we have three weeks.  In our next class, we’ll share our ideas with each other, talk with someone currently pursuing this kind of research, and refine our ideas down to one research question.  We look forward to your thinking!

Part 2:

With your research question in hand, and with a general sense of your methodology–what your data will look like, what you will do with it, and a first sense of how you might present your findings–it’s time to get to work.  These projects are due in two and a half weeks, on Monday, May 8, and we’ll spend both Monday and Tuesday of that week presenting our questions, methodologies, and findings to each other.  

To help organize your time between now and then, begin by solidifying the following items:

Final Question

What is your research question?  This should be a question that can sustain three weeks of research, interpretation, and exposition.

Estimate of Time

  • How long will it take to collect and clean your data?
  • How long will it take to run your analyses?  To create meaningful visualizations?
  • How long will it take to write and edit your paper?

Methodology

What steps will you need to take to acquire and clean your data set?  What kinds of analyses will you perform?  What do you need to do in order to perform them?  Note: your methodology can be described at a very high altitude, and it can be described at a very low altitude.

Possible Challenges and Workarounds

What obstacles are you likely to encounter?  How will you overcome them?

Part 3:

Your final submission for Project 3 will be different from previous projects.  In addition to submitting your data for this project, you’ll also submit your analysis and reflection together in one Mathematica notebook.  Typing an essay in Mathematica might be new to you, so please find on Canvas a Style Guide with recommendations for how you might format your document.  

This means that this project comprises two main parts:

  1. A clean data set, preferably in one document, though it can be spread across several documents.
  2. A Mathematica notebook containing:
    1. An introduction to your topic and research question
    2. A walkthrough of your methodology and analysis.  This is where your code lives.
    3. A reflection and interpretation at the end in which you consider the significance and/or the implications of your analysis.

This work will be assessed on the following criteria:

Research components

  • Do you have a research question?  How deep is it? That is, is it a binary question or does it have the potential to lead to complex answers and further questions?
  • Do you have a methodology?  How effective is it at surfacing meaningful results?
  • Do you have a data set?  How effectively does the data set address the question?
  • Does your methodology effectively answer your question and provide a logically compelling arc?
  • Do you reach a conclusion?  How logically supported is your conclusion?
  • Do you acknowledge limitations of your research? How fully do you discuss further/future possible questions?
  • Does your research evolve over the course of your process? What new questions do you generate as you go along?  

Presentation

  • How clearly defined are the sections of your writing?
  • How clear are the annotations of your code?
  • How grammatically and syntactically accurate is the writing?
  • How organized is your notebook?  Does your analysis follow a logical arc and have a structure, or does it merely describe your process?

Growth

  • Have you taken steps towards independence in asking questions?
  • Have you moved beyond your previous knowledge in your use of code to perform analyses?

We very much look forward to reading your work!  Final submission will be on Monday, May 8.

[Link to PDF version of assignment]


Project 4: Independent Project #2 (Weeks 7-9)

Introduction

Phonetic fingerprints of rap artists, social network analyses of books in the Bible, geographic imagination in War and Peace, the presence of “love” in Beatles songs–we’ve only just begun to scratch the surface. This has two meanings: one, we are just beginning to discover the range of topics one can explore with digital tools, and two, we may feel that our analyses have looked at only a narrow slice of our topics of interest.  We have only cracked open the shell of computational analyses, and we have less than three weeks to go.  And so, this sets the stage and context for our final project.

Having seen a few of your peers’ projects, having just warmed up to your own work, where would you like to go now?  What would you like to explore?  Let’s start by taking stock of what we have done.

Part 1:

  1. Reflect on what you have done so far.  In 200-250 words, answer two questions: 1) What has your work in class this term enabled you to do that you have found most compelling?  And 2) What have you found most compelling in what others have done?  Take stock of what you have learned and reflect on it here.  Share how your topics and/or methodologies–or those of your classmates–have informed your understanding of the digital humanities, and how these topics and methodologies can offer insight into your areas of interest.

Part 2:

Part 1 is the warm-up.  Having now thought about different topics and methodologies, what would you like to do next?  Consider several options:

  1. Expand on your current project by taking a new, deeper approach
  2. Develop data visualizations based on a new or existing topic
  3. Conceive of a new project altogether
  4. Take up a classmate’s project, and expand on the work they have done.

In each case, this project asks you to take on a new digital humanities skill.  Use a new tool.  Approach an analysis in a new way.  Explore a topic with a methodology that is new to you.  Do something you haven’t already done.  What will you do with this charge?

For tomorrow (Thursday):

At the end of your 200-250 word reflection for Part 1, write a proposal for what you will do next that answers the following questions:

  1. What is the title of your proposal?
  2. What question will you explore?
  3. Why do you want to pursue this topic?
  4. How will you decompose the problem?
  5. What is your anticipated outcome?
  6. What will be your timeline?

Type up your responses to all these.  We will collect your proposals tomorrow in class.  Here are a few notes:

  • You may not include sentiment analyses as part of your proposal.
  • Nor may you rely on single word frequency analyses (appearances of individual words).
  • You may collaborate on this project.  If you do, you must answer a seventh (two-part) question above: 7. If you are collaborating, how will the scope of the project be larger, and how will you divide or manage the work so that the partnership will be equal?

This project is an opportunity to expand on what you have done or seen, or to try something new altogether.  You have your feet underneath you now with regard to Wolfram Language and working with data.  You’ve seen some of your classmates’ work.  What will you try now?

Proposals are due in our next class, Thursday (5/11)

Final Projects are due and will be presented on Monday (5/22)

[Link to PDF version of assignment]


