Literature and Data
- Sarnav
- 4 days ago
An aesthetic and statistical approach via “Herland”
Most of us think of literature as abstract and intellectual, and we do not find that strange. We hardly expect it to take an analytical or rational form. But poems can have layers, can't they? Writing within certain forms or restrictions lets us enjoy different styles and create a completely different image and structure. Likewise, when we finish a book we often make logical enquiries about it. At such moments, even if we are not aware of it, our reading can become mathematical.
I had previously written an article with a similar idea, analysing the relationship between the frequency of punctuation marks in literature and the emotion in the narrative.
With a similar impulse, I wanted to discuss the book I had read for our book club, Herland. But this time I also made a visual design.
Without going into detail: Herland is an example of a feminist utopia. Three male characters enter an all-female community, and the two groups learn from and teach each other.
I was wondering how large a role each character plays in the story, and it occurred to me to measure it. But I wanted to approach it from an aesthetic point of view as well as a statistical one, so I created two simple works, each addressing a different question.

Photo by Niko Nieminen on Unsplash
First of all, I must say that I have no talent or skill in the field of design; I am learning the process slowly and pleasantly, working by hand as much as I can. Since I don't trust my imagination either, I wanted to see what I could do by copying the designs of another designer whose work appealed to my eye. Imitation is one of the best ways of learning.
The first of my works was one that was pleasing to the eye and that I enjoyed making. I concentrated on the twelve chapters of the book and wrote down the first sentence of each chapter. I did not refer to the physical copy of the book, but to the digital material (and another edition).
For the first sentence of each of the twelve chapters, I worked out which part of speech each word belonged to, with some help from artificial intelligence. I sorted every word into one of eight categories, such as noun, conjunction, preposition, adverb and verb. Of course, some words do not fit exactly and fall into mixed categories such as verb phrase or adjective-verb, but I did not worry about these because my aim was a visual piece rather than a precise analysis.
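For readers who would rather tally the categories in code than by hand, here is a minimal sketch. The sentence and its labels are placeholders I made up for illustration (they are not from the book), and the category names "article" and "adjective" are assumptions about what the eight categories might include.

```python
from collections import Counter

# Hand-assigned (word, part-of-speech) pairs for one illustrative sentence.
# Both the sentence and the labels are placeholders, not taken from the book.
tagged_sentence = [
    ("the", "article"),
    ("three", "adjective"),
    ("travellers", "noun"),
    ("slowly", "adverb"),
    ("walked", "verb"),
    ("into", "preposition"),
    ("the", "article"),
    ("forest", "noun"),
]


def count_categories(tagged_words):
    """Return how many words fall into each part-of-speech category."""
    return Counter(tag for _word, tag in tagged_words)


counts = count_categories(tagged_sentence)
for tag, n in counts.most_common():
    print(f"{tag:12s} {n}")
```

The labelling itself still has to come from somewhere (by hand, or from an AI assistant as I did); the code only does the counting.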

The star chart design I prepared for “Herland - Part 1”
To illustrate, I took the first sentence of the first chapter, identified the part of speech of each word (ignoring the first word), and then followed the instructions at the top left of the design to create a random “star map”. I did not follow the instructions at the top right because it was too much work to arrange long sentences.
After doing the same for each chapter, I wanted to put them together into a complete “star cluster”. This is the result.
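The idea of a random star map can also be sketched in a few lines of code. This is only a rough illustration under my own assumptions: one star per word, placed in a random free cell of a text grid, with a symbol chosen by part of speech. The symbol table, grid size and seed are all hypothetical and do not reproduce the instructions on the poster.

```python
import random

# Hypothetical symbol per category; the poster's real legend is not reproduced here.
SYMBOLS = {"noun": "*", "verb": "+", "adverb": "o", "preposition": ".", "conjunction": "x"}


def star_map(tags, width=30, height=10, seed=0):
    """Place one symbol per tag in a random free cell of a width x height grid."""
    rng = random.Random(seed)  # fixed seed: the same sentence always gives the same map
    cells = [(r, c) for r in range(height) for c in range(width)]
    positions = rng.sample(cells, len(tags))  # distinct cells, so stars never overlap
    grid = [[" "] * width for _ in range(height)]
    for (row, col), tag in zip(positions, tags):
        grid[row][col] = SYMBOLS.get(tag, "?")  # "?" marks a category with no symbol
    return ["".join(row) for row in grid]


tags = ["noun", "verb", "adverb", "preposition", "noun"]
chart = star_map(tags)
print("\n".join(chart))
```

Giving each chapter its own seed and stacking the grids would be one simple way to approximate the combined "star cluster".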

“Herland” - Star cluster design that I prepared by combining all parts
The result of my first work is as you can see. It has no literary significance, and the interpretation we can draw from it is very limited. Besides, I don't think it would be sound to evaluate a book from the first sentence of each chapter.
It was just an experiment that I did with enthusiasm because it was visually interesting and related to literature. I would like to do the chapters of every book I read in this way (visually much better, of course) and turn them into posters. I don't need to tell you how tiring that is. (Maybe our friends who know how to code can do a better job.) But I liked the idea. Such designs, apart from the book itself, could be a sweet reminder of the time I spent reading.
Then I thought it would be appropriate to do a similar study in a statistical context. I thought that at least some interpretation (in line with the subject) could be introduced and perhaps some additional evaluation could be added to the book.
I needed a subject whose data could be collected, interpreted and visualised in a meaningful way. So I started to work on the length of the dialogues.
This time I worked entirely with artificial intelligence. First of all, I sent it the digital copy I had and asked it to identify the characters and their dialogue: in each chapter, how many times each character speaks and how many sentences he or she utters.
The amount of data to process naturally creates a margin of error. Although dialogue is marked with quotation marks, characters sometimes express their thoughts with gestures, and sometimes a line of dialogue is interrupted and split in two. Sometimes the AI asked whether a line containing only exclamations (such as 'wow wow wow!') should count as dialogue. I ended up saying yes to all of them. So there is inevitably a margin of error in the output.
I let the AI compile the chapters one at a time, moving slowly, until I was halfway through the book. Then I changed my method, both to avoid possible errors caused by format differences and to get more accurate data in the original language: I had a copy of the book I found online processed with the same instructions. However, not wanting to trust the AI too much, I only asked for the graph of one particular chapter. It did not take long to see that it made mistakes when I asked for the full data. It seems we have a long way to go with artificial intelligence.
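The quotation-mark rule described above can be written down as a small script. This is a minimal sketch, not what the AI actually did: it assumes dialogue always sits between double quotes (straight or curly) and is never nested, and the sample text is a placeholder of my own. It does not try to work out who is speaking.

```python
import re

# Matches text between straight or curly double quotes.
# Assumption: dialogue is always enclosed in double quotes and never nested.
QUOTE_RE = re.compile(r'["“]([^"“”]+)["”]')


def extract_utterances(text):
    """Return the quoted passages in reading order."""
    return [match.strip() for match in QUOTE_RE.findall(text)]


def word_count(utterance):
    """Count whitespace-separated words; 'wow wow wow!' counts as three."""
    return len(utterance.split())


# Placeholder text, not a passage from the book.
sample = 'He looked up. "Wow wow wow!" Then she asked, “Where did you come from?”'
utterances = extract_utterances(sample)
lengths = [word_count(u) for u in utterances]
print(list(zip(utterances, lengths)))
```

A line that is interrupted by narration, such as a quote broken by "he said", comes out as two separate utterances here, which is exactly the kind of margin of error mentioned above.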

“Herland - Part 6 Dialogue Lengths” Table
So I asked it to chart, in the order they spoke, how many words each character who speaks in the sixth chapter used in each sentence. To be honest, I'm not sure this is really meaningful data. Short sentences can tell you a lot.
Nevertheless, I found this table interesting because it shows how often a character interjected.
Then I thought about how this plays out across the book as a whole. If we accept that the low percentages may carry a margin of error, we get more or less reliable data. Looking at the whole book, rather than a single chapter, seems a more appropriate basis for comment.
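A chart like this can be approximated from a list of turns. The sketch below is an illustration only: the speakers and lines are placeholders rather than the chapter's real dialogue, and treating any turn of two words or fewer as an interjection is my own assumption.

```python
from collections import Counter

# Hypothetical (speaker, utterance) pairs in spoken order; all placeholders.
turns = [
    ("Speaker A", "We should ask them about their history."),
    ("Speaker B", "Why?"),
    ("Speaker A", "Because everything here was built by someone."),
    ("Speaker C", "Agreed."),
    ("Speaker B", "Then let us ask tomorrow."),
]


def length_sequence(turns):
    """Return (speaker, word count) for each turn, keeping the spoken order."""
    return [(speaker, len(text.split())) for speaker, text in turns]


def interjections(turns, max_words=2):
    """Count, per speaker, the turns of at most max_words words (assumed threshold)."""
    return Counter(s for s, n in length_sequence(turns) if n <= max_words)


# A simple text bar chart: one row per turn, one '#' per word.
for speaker, n in length_sequence(turns):
    print(f"{speaker:10s} {'#' * n} {n}")
print(interjections(turns))
```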

“Herland - Total Speech Percentages” Pie Chart
This helped clear up my confusion about how much each character (other than the most talkative one) contributes to the dialogue. Seeing the percentages of the others helped me to remember what was said in which chapter, and unexpectedly let me do a quick scan of the book in my head.
We can easily read other things from the chart. Men take part in more than half of the dialogue. You have to read the book to interpret this in detail, but it is not difficult to say that the character marked in blue is dominant: he either appears in many chapters or speaks at greater length than average.
It is also interesting that six different female characters are named, while a certain trio appears less often. Perhaps the actions of these characters speak louder than their words.
The most important point here is undoubtedly the choice of subject, and then the design itself. As I said, I am not even a beginner in this area, so I have gone with the first points that came to mind. You may have noticed other points worth emphasising while reading a book or this article.
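The percentages behind a pie chart like this are simple to compute once the word totals exist. Here is a minimal sketch with invented numbers: the speakers, totals and group labels are all hypothetical and are not the book's real figures.

```python
# Hypothetical word totals per speaker with a group label; not the book's real figures.
totals = {
    "Speaker A": (1200, "male"),
    "Speaker B": (600, "male"),
    "Speaker C": (800, "female"),
    "Speaker D": (400, "female"),
}


def speech_shares(totals):
    """Return each speaker's percentage of all spoken words."""
    grand_total = sum(words for words, _group in totals.values())
    return {name: 100 * words / grand_total for name, (words, _group) in totals.items()}


def share_by_group(totals):
    """Return each group's percentage of all spoken words."""
    grand_total = sum(words for words, _group in totals.values())
    group_words = {}
    for words, group in totals.values():
        group_words[group] = group_words.get(group, 0) + words
    return {group: 100 * words / grand_total for group, words in group_words.items()}


shares = speech_shares(totals)
groups = share_by_group(totals)
for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:10s} {pct:5.1f}%")
print(groups)
```

With real totals, small shares would be the ones most affected by the margin of error discussed earlier, so they are best read as rough indications.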
But it can be useful to build up such designs and blocks of data, both in terms of their appeal to the eye and in terms of revealing aspects that might not be noticed at first glance.