What data to gather?

When I first approached the project, I needed to decide what would make up its data. It was clear that I would be using keywords and topic modeling to explore the documents, but which documents would I explore?

Since its inception in 1949, the American Quarterly has changed its format a few times, but it has regularly published a combination of articles and book/event reviews. Over the years the reviews were presented in many different ways: some issues showcased publications grouped by theme, others reviewed individual books or events in the context of American Studies, and some issues had no reviews at all. This raised a question: should I include the reviews in the hope that they would add insight into when topics were being discussed? Or would the fluctuation in the number of reviews over- or under-represent certain topics, given that topic modeling is based on modeling individual articles?

Because the journal's purpose in publishing reviews changed so much over the years, and because my capstone mentor advised that the reviews were not good indicators of what was actively happening in the field (though they might be interesting to look at in the future), I decided not to include them in the data. This left me with just the core articles of original research that made up the journal.