White Paper for Website, “Visualizing Education”


“Our Little Folks”: Typical School Book of the Time Period 1865-1910

What forces shaped American public school books from just after the Civil War until the outbreak of World War I? What impact did these forces have on the actual texts of these books? These are the questions my website, Visualizing Education (http://matthewtkeough.omeka.net), attempts to answer using close reading, text mining, and visualizations of textbooks published from 1865 to 1910. This white paper describes the website and how it could be improved, its audience, how it is innovative, why text mining and visualizations are an appropriate use of technology for this subject, its accessibility features, and how others can build upon the site. I also explain the decisions I made in designing Visualizing Education and the lessons I learned through the process.
Visualizing Education has three main sections: the homepage, an archive of 18 full-text versions of public school books from 1865 to 1910, and a visualizations page. The homepage tells the user what questions the website is trying to answer and gives a brief introduction to America in the period under consideration, which is Reconstruction and the Gilded Age. It also introduces the major themes of the school books and describes William Holmes McGuffey, the most famous and influential textbook author of this time. Logistically, the homepage introduces the website and explains the purpose of each tab. The homepage turned out well. It is clean and a good starting point for the user. It does contain a lot of text, but that text introduces the website, and the other pages are less text heavy.
The “School Books” tab leads the user to an archive of the 18 school books used. The listing shows each title and, for the most part, just the publication information. I tried to keep the full text out of this listing but was not always successful. Clicking on a title leads to the Dublin Core metadata, with the publication information clearly displayed, and to the full text of that school book. The line and paragraph formatting is not always ideal, but the main goal was to enable users to create visualizations and text mining outputs, and this format serves that goal well. In addition, the location of each publisher is shown on a map along with the other Dublin Core metadata. I chose an archive-style format for this part of the website because the texts are the most important element and they have no images to display. Unlike an exhibit page, this format has buttons that tell users what page they are on, which is important for making sure they know there are more school books on the next page.
The last section of Visualizing Education is the Visualizations tab. This section consists of the visualizations that I created. I used my previous research to come up with some of the ideas for the visualizations, but I also pursued some new areas of research that this site enabled me to explore. I chose to lay out this part of the website as an exhibit because that format allowed me to display both images and text. The ability to display images was vital to this section since it deals with visualizations, maps, and text mining results.
Even though this website does a lot well, there is also a tremendous amount of room for improvement. The website does an excellent job of making these texts available in a format that users can easily turn into their own visualizations. It also effectively displays the visualizations that I created, which confirm previous scholarship and, in some cases, open new areas of research. The main improvement, which I simply did not know how to build, would be to integrate Voyant directly into the archive of textbooks on my website. Ideally, there would be a section of the website where users could check off the texts they want to research, click a button, and be taken directly to the Voyant text mining page. An even better choice would be to use the R programming language to do the text mining and visualizations entirely within Visualizing Education. Such a plugin would improve the user experience and eliminate the need for the long instructions on the homepage. In addition, I had hoped to format the website so that a small picture of each visualization would appear on the main Visualizations tab as an introduction. However, I could not figure out how to do that within the given amount of time. I would also have liked to include more texts to get a better appreciation of how the content changed over time. However, I was limited by the number of texts that were already available online through the Internet Archive or Project Gutenberg. There is enough text to get a general picture of the time period but not enough to see changes over time within it. Finally, I would have liked to have a section where people could upload their own textbooks from this time period. I do have a small “Contribute an Item” link at the bottom of the site, but I would like to make it more prominent and easier to use.
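The one-click hand-off to Voyant described above could be sketched roughly as follows. This is only an illustration, not something the site currently does: it assumes the public Voyant Tools server accepts a repeated `input` query parameter holding the URL of each text to load, and the item URLs shown are hypothetical placeholders rather than real pages on the site.

```python
from urllib.parse import urlencode

# Assumption: the public Voyant Tools server, which is assumed to accept
# one `input` query parameter per document URL.
VOYANT_BASE = "https://voyant-tools.org/"


def build_voyant_url(text_urls):
    """Return one Voyant link that loads every URL in text_urls as a corpus."""
    if not text_urls:
        raise ValueError("select at least one text")
    # Repeat the `input` parameter once per selected text; urlencode
    # percent-encodes each URL so it can travel inside the query string.
    query = urlencode([("input", url) for url in text_urls])
    return f"{VOYANT_BASE}?{query}"


# Hypothetical item URLs standing in for the school books a user checked off.
selected = [
    "http://matthewtkeough.omeka.net/items/show/1",
    "http://matthewtkeough.omeka.net/items/show/2",
]
print(build_voyant_url(selected))
```

A checkbox form on the “School Books” page would only need to collect the selected item URLs and redirect the browser to the link this function returns.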
The target audience for the website is high school students who would like an introduction to digital history as well as to school books during Reconstruction and the Gilded Age. The subject matter should appeal to them, since school children can relate to textbooks even when the content differs from their own. This will hopefully prompt students to think about the differences between their textbooks and the school books on Visualizing Education. The site is set up to be easy to use. The visualizations that are already produced give students an introduction to what is possible, and my instructions tell them how to pursue their own inquiries using Voyant. This will hopefully make them more involved in the project.
Visualizing Education fits in with previous scholarship but also takes an innovative approach. Books such as Ward McAfee’s Religion, Race, and Reconstruction: The Public School in the Politics of the 1870s and Dolores Sullivan’s William Holmes McGuffey: Schoolmaster to the Nation discuss similar issues: how the textbooks deal with race, gender, morality, industry, and patriotism. However, these studies use just one or two selected readings as evidence for each point. They do not examine the entire text of all of these school books to identify general trends. Visualizing Education is innovative because it looks at the same content in new ways, using text mining and visualizations to confirm old claims and to investigate new areas in an easy-to-use format. This also shows how the website makes good use of the technology. It is not just a PDF version of these texts. It enables users to apply new technology to explore this content in a new way. The user gets a broader sense of the content and is not limited to close reading of a small number of school books.
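To make the idea of looking for general trends across whole texts concrete, here is a minimal sketch of one simple text mining measure: how often chosen theme words occur per 10,000 words across a set of texts. It is not how Voyant computes its results, and the sample sentences and theme words are made up for illustration.

```python
import re
from collections import Counter


def term_rates(texts, terms):
    """Return occurrences of each term per 10,000 words across all texts."""
    counts = Counter()
    total_words = 0
    for text in texts:
        # Lowercase the text and split it into simple alphabetic words.
        words = re.findall(r"[a-z']+", text.lower())
        total_words += len(words)
        counts.update(word for word in words if word in terms)
    if total_words == 0:
        return {term: 0.0 for term in terms}
    return {term: 10000 * counts[term] / total_words for term in terms}


# Made-up sample sentences and theme words, for illustration only.
sample_texts = ["God is good. Work hard.", "Work and play"]
print(term_rates(sample_texts, {"work", "god"}))
```

Running the same function over every book in the archive, rather than over one or two selected readings, is what allows a claim about a theme to be checked against the whole corpus.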
The archive portion of the website makes Visualizing Education well suited to being built upon. Users can already create their own visualizations, and in the future they will also be able to submit their own content. They can easily browse to the book or books they are interested in from the “School Books” tab and copy and paste that URL into Voyant to obtain their text mining results. The existing visualizations will give users ideas about what they can do with these tools. In addition, the texts are all out of copyright, so there are no additional steps or regulations people need to take into consideration in order to use the books on their own.
My website will be accessible to everyone with an internet connection and will comply with the Americans with Disabilities Act. The site is free to use, with no membership fees. I have attached a PDF copy of each text where one was available. This will enable people with visual impairments to load the texts into screen readers so they can at least read the contents of these textbooks. I have also placed my contact information on the website so people can email me if they have any issues using it.
I made many decisions in order to improve the usability and scholarly quality of this website. I placed an introduction on the homepage to give the reader an overview of the major themes of the time period. I also included the full text of every book so people could do close reading themselves and not just rely on my visualizations. The texts were already online, but they were not in the best format and contained many errors, so I had to make a lot of decisions about how to clean them up. I found that Project Gutenberg’s HTML files were the best quality, so I used them whenever possible. When one was not available, I used the DAISY file from the Internet Archive. I then worked from the HTML file, which still included many errors. I decided to paste the contents into Notepad on my computer, remove all the formatting, and run a global search for extra spaces, erroneous text such as references to Google, “-” symbols where a word was broken across lines, and any symbols mistakenly produced by the Optical Character Recognition (OCR) software. I removed the formatting because I could not upload such large amounts of text into Omeka otherwise. I replaced the erroneous symbols and extra spaces with a single space to help improve the text mining results. I also used the tool at http://textfixer.com/tools/remove-line-breaks.php to get rid of the extra line breaks in the OCR’ed documents. I may have removed some real line breaks, but I thought this was worth doing because the words, not the paragraph breaks, matter most for text mining. I also included PDF copies of the texts so users could see the contents in their original form. Finally, I chose to have users copy and paste the URL of a book into Voyant rather than the text itself. I realized this means Voyant also picks up some of the text from my website, but I thought this was worth it for ease of use.
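The manual cleanup steps described above could be approximated in a short script. This is a sketch under stated assumptions, not the procedure actually used for the site: the set of stray symbols is illustrative, lines mentioning Google are dropped whole, and words split by a hyphen at a line break are rejoined rather than replaced with a space.

```python
import re


def clean_ocr_text(raw):
    """Approximate the manual cleanup steps on one OCR'ed text."""
    # Drop whole lines that mention Google (digitization notices).
    kept = [line for line in raw.splitlines() if "google" not in line.lower()]
    text = "\n".join(kept)
    # Rejoin words split by a hyphen at a line break. This also merges
    # genuine hyphenated compounds that happen to fall at a line end.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Replace stray OCR symbols with a space (illustrative set, an assumption).
    text = re.sub(r"[|~^_*¬■•]", " ", text)
    # Collapse line breaks and runs of whitespace into single spaces,
    # which also discards real paragraph breaks.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


sample = "The lit-\ntle  boy ran.\nDigitized by Google\nHe was | glad ~ to play."
print(clean_ocr_text(sample))  # The little boy ran. He was glad to play.
```

As with the manual process, this trades away paragraph structure in exchange for cleaner words, which is the part that matters for text mining.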
I learned several lessons from producing this website that I will use in the future. I will always remember to leave extra time to complete online projects. I found that Omeka can do a lot of things, but the process for completing a task was not always straightforward, so nearly everything about the website took me longer than I expected. I also learned firsthand that one's choices are restricted when using prepackaged software. For example, I would have liked to use the original Omeka home screen, but it included the Recent Items section, which I could not remove, so I had to switch to another homepage. That being said, I would not have been able to accomplish most of what I did without Omeka, so I do appreciate the benefits of prepackaged software as well.


