Hunter Corb is a second-year Master of Library Science candidate at UNC’s School of Library and Information Science. With a focus in archives and rare books, Corb is currently working on a digital humanities master’s paper to finish his degree. The project is a social network analysis investigating the role of women in the 17th-century London book trade. Below, Corb describes his project, his process, and the obstacles he has faced along the way.
1. What is your project?
I am conducting a social network analysis of the London book trade during Revolutionary England (1641-1661). This is mostly an exploratory study in which I am using statistical analysis in the hope of both confirming (or pushing back against) existing scholarship and identifying communities and potentially influential figures for further research. Additionally, I plan to use random walk techniques to try to pinpoint missing links in the network as it currently stands; in so doing, my hope is to identify areas in which women, whose work in the book trade has historically been silenced, were vital and influential figures.
2. How did you conceive of this project?
The idea for this project first came to me in an English digital humanities class that I took last spring here at UNC. We spent two weeks on social network analysis as a methodology, and it struck me as odd that no one had yet attempted to use it to analyze the book trade. (As I continued with the project, I began to see why.) Thus, the idea for the project was born.
3. How did you design the project?
I spent a considerable amount of time, probably much more than I needed to, trying to identify potential sources of data and deciding which pieces of information about each entity were pertinent. Did I want to map their geographical locations as well? How was I to deal with the differences in years active? And, after I had put together a dataset of nodes, where would I get the information needed to establish the edges, the relationships between them?
I ultimately decided that a smaller, more niche project was better than trying to map and determine influences in the book trade as a whole, for a variety of reasons. For one, in dealing with a very real social network that encompassed an incredible variety of trades, I could not very well map all of them, as each was a different community, made up of different kinds of people with their own influences and interrelationships. The engraving community did not have the same characteristics as the typographic community, which in turn was not the same as the printing community (and the first two are not really part of the core of the book trade as we think of it now), yet all influenced and were influenced by each other and were instrumental, albeit sometimes indirectly, in the overall functioning of the book trade. Thus, I decided to eliminate all such ancillary communities in favor of the main ones involved in the core of the trade: printers, stationers, and booksellers.
I ended up finding two datasets to work with: the London Book Trade Index, which Dr. Ian Gadd so graciously let me use, and Maureen Bell’s preliminary list of women in the book trade from 1557-1700 (filtered for my particular dates, of course). I also decided to use an open-source program called Gephi to create a dynamic visualization of the network, one with built-in statistical measures as well as a timeline feature.
To establish the edges between nodes, I decided to use relationship attributes so that I could filter by relationship type: familial, master/apprentice, and purely business ties. Some of this information was in the two datasets I mentioned above, but the rest came from the Stationers’ Company Registers.
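To make the idea of typed edges concrete, here is a minimal Python sketch, not the actual project data or workflow. The people named are hypothetical placeholders, and the `Edge` structure and helper functions are assumptions introduced only for illustration. It shows how tagging each edge with a relationship type lets the network be filtered to one kind of tie at a time.

```python
from collections import namedtuple

# Hypothetical structure: one record per relationship between two people.
Edge = namedtuple("Edge", ["source", "target", "relation"])

# Placeholder names, not drawn from the real datasets.
edges = [
    Edge("Printer A", "Apprentice B", "master/apprentice"),
    Edge("Printer A", "Widow C", "familial"),
    Edge("Widow C", "Bookseller D", "business"),
    Edge("Bookseller D", "Apprentice B", "business"),
]

def filter_by_relation(edges, relation):
    """Keep only the edges of one relationship type."""
    return [e for e in edges if e.relation == relation]

def neighbors(edges, person):
    """People directly tied to `person`, treating edges as undirected."""
    result = set()
    for e in edges:
        if e.source == person:
            result.add(e.target)
        elif e.target == person:
            result.add(e.source)
    return result

business = filter_by_relation(edges, "business")
print(len(business))                       # number of business ties
print(neighbors(business, "Bookseller D")) # business contacts only
```

Filtering first and then asking who is connected to whom gives a different picture for each relationship type, which is the point of storing the type on the edge rather than building three separate networks.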
4. What obstacles have you faced?
As always when working with historical data, missing data is inevitable. For example, since women were generally silenced in the historical record, we have no way of knowing for sure the full extent to which women participated in the trade. Additionally, though stationers were encouraged, and at various moments forced by law, to register their copies (titles and the right to print them) in the Stationers’ Company Registers, they did not always do so; in Revolutionary England, in fact, copies went unregistered more often than not, because a good portion of what was printed was illicit and seditious. The biggest obstacle that I have had to face while working on this project, however, has been the difficulty of working with other people’s data and getting those datasets to speak to one another. Though not impossible, cleaning the dataset, including supplying some missing information and expanding (and making sense of) personal abbreviations, has proved a much more time-consuming task than I originally thought. Then, once I finally had the dataset that I was going to work with, I found that the labels and data types I had pulled from the database were not compatible with Gephi, so I had to go back through my data again and crosswalk the labels to get something that I could work with. Planning the project ahead of time is one thing, but I found that no amount of planning can completely prepare you for the realities of creating and executing a project such as this.
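A crosswalk of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the process actually used (which relied on Access and Excel). The source column names (`person_id`, `full_name`, `from_id`, and so on) and the sample rows are hypothetical. The target names `Id` and `Label` for the nodes table and `Source`, `Target`, and `Type` for the edges table follow the column names Gephi's spreadsheet import looks for. Because Gephi reads `Type` as directed or undirected, the relationship type is kept in a separate column, here called `relation`.

```python
import csv
import io

# Hypothetical source-column -> Gephi-column mappings.
NODE_CROSSWALK = {"person_id": "Id", "full_name": "Label", "trade": "trade"}
EDGE_CROSSWALK = {"from_id": "Source", "to_id": "Target", "rel_type": "relation"}

def crosswalk(rows, mapping):
    """Rename the columns of each row according to `mapping`."""
    return [{new: row[old] for old, new in mapping.items()} for row in rows]

def to_csv(rows):
    """Serialize a list of dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Placeholder rows standing in for an export from the source database.
source_nodes = [
    {"person_id": "1", "full_name": "Printer A", "trade": "printer"},
    {"person_id": "2", "full_name": "Bookseller D", "trade": "bookseller"},
]
source_edges = [{"from_id": "1", "to_id": "2", "rel_type": "business"}]

gephi_nodes = crosswalk(source_nodes, NODE_CROSSWALK)
gephi_edges = crosswalk(source_edges, EDGE_CROSSWALK)
for edge in gephi_edges:
    edge["Type"] = "Undirected"  # assumption: ties are treated as mutual

nodes_csv = to_csv(gephi_nodes)
edges_csv = to_csv(gephi_edges)
print(nodes_csv)
print(edges_csv)
```

Keeping the mapping in one dictionary per table means that if a label changes in the source data, only the mapping has to be updated, not every row.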
5. What was your process? What skills or technologies have you used?
I pulled my data from an Access database and used the query feature in Excel, which I was more familiar with, to merge and filter the data to get what I needed. I also used Gephi as the open-source platform for creating and statistically analyzing my network visualization. I touched on my process a little bit before, but I will spell it out more clearly here. I first conducted a search to determine where I was going to draw my data from, and then, once I knew what I had to work with, I set about designing the parameters of the project, i.e., what information I would include, what I would exclude, and what exactly I could look for and analyze with the data that I had.
Next, I extracted the data that I needed from the larger dataset and, after looking at Gephi and its requirements, I crosswalked the labels from my dataset to match the ones in Gephi so that I could more easily keep the information pertaining strictly to my nodes separate from the relationships between them, which would eventually form the edges. I then uploaded this dataset and created a basic network visualization. Next, I went through the Stationers’ Company Registers to create the business partnerships that would form the other relationship type in my edges (keeping in mind, of course, that there is potentially a wide range of missing data, both from copies that were not registered and from business collaborations that may have been conducted without documentation). It was beyond the scope of this project to go through printed books systematically in an attempt to map the business partnerships in the London book trade more completely, so I decided to stick with master/apprentice relationships and collaborative registration of copies. Finally, after creating a final network, I ran statistical analyses on it to see what kinds of conclusions I might be able to draw.
I have already run centrality measures, including eigenvector centrality, but as I am still in this stage of the project there is much more left to do.
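For readers unfamiliar with these measures, the following is a small pure-Python sketch of degree centrality and eigenvector centrality. It is not Gephi's implementation and does not use the project's data: the five nodes are hypothetical placeholders, and the function names are assumptions made for illustration. Degree centrality counts a person's direct ties, while eigenvector centrality, computed here by power iteration, also rewards being tied to well-connected people.

```python
import math

def build_adjacency(edges):
    """Undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def degree_centrality(adj):
    """Share of the other nodes each node is directly tied to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

def eigenvector_centrality(adj, iterations=200, tol=1e-9):
    """Power iteration; scores are normalized to unit length."""
    scores = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # Adding the node's own score (iterating on A + I) avoids
        # oscillation on bipartite graphs without changing the ranking.
        new = {node: scores[node] + sum(scores[nb] for nb in adj[node])
               for node in adj}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {k: v / norm for k, v in new.items()}
        if sum(abs(new[k] - scores[k]) for k in adj) < tol:
            return new
        scores = new
    return scores

# Placeholder network: A, B, C form a triangle; D links A to E.
ties = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
adj = build_adjacency(ties)
deg = degree_centrality(adj)
eig = eigenvector_centrality(adj)

for node in sorted(adj):
    print(node, round(deg[node], 2), round(eig[node], 3))
```

In this toy network B and D have the same number of ties, yet B scores higher on eigenvector centrality because its other neighbour, C, is itself better connected than D's other neighbour, E. That difference is what makes the measure useful for spotting figures whose influence comes from whom they are tied to rather than from how many ties they have.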
6. What have you learned through doing this project?
There are two big takeaways for me from this project, besides what I learned about myself as a scholar (and what conducting this kind of project entails). The first is that digital humanities projects, by their very nature, are collaborative. Though I still have some finishing touches to put on the project, I could not have gotten through it without the aid of a number of people, all of whom gave me good advice on how to shape it and helped answer my myriad questions. I learned part of this the hard way: by trying to do it on my own in the first place. There was too much to learn, too many aspects of the project that I was not familiar with and had no background in, and at some point along the way I had to concede this to myself. Coming from a liberal arts background, I took it as a given that research is conducted individually and in isolation; rarely, if ever, does something become a collaborative project, at least with traditional scholarly methods. However, digital humanities projects sit at the intersection of a variety of topics and skills, and they will almost always require more than one person’s expertise to be done, and to be done well. We must be open to this kind of collaboration, able and willing to listen to the expertise of others even as we lend our own.
The second major thing that I learned through this project is that digital humanities cannot be done for the sake of doing something digital; there must be a particular purpose behind the choice of method, and there must be a reason for using a digital method over a more traditional, analog one. Otherwise, what is the point? This is not a new idea (indeed, many digital humanists have espoused the same in recent decades), but it took conducting a project of my own for me to realize the full import of that statement.