By Mark Hibbett
This paper will describe the process of generating a corpus of comics for an examination of the transmedial development of the character Doctor Doom during the period known as ›The Marvel Age‹. It will briefly define what ›The Marvel Age‹ means in this context and set out the rationale for deciding which items to include in the corpus. It will then discuss in some detail the use of online comics databases, notably The Grand Comics Database, and describe the many difficulties inherent in using a dataset that has been collaboratively generated over a long period of time without clear editorial guidance. It will also suggest data-cleaning methods by which these issues can be mitigated. Finally, it will discuss how this corpus will be used in future to analyse the progress of Doctor Doom’s characterisation through this period.