When I was at secondary school, we sometimes used to have serious discussions about the longest word in the English language, alleged there to be ‘antidisestablishmentarianism’.
This word might be parsed as follows.
Disestablishment was about doing away with the Church of England, the established church, the church of the establishment. The church which provided the bishops who sat in the House of Lords. The church presently headed up by our Queen. The church which, until say 1950, a significant number of people attended on at least an occasional basis, perhaps as many as 5% of the relevant total in 1950. Members of other churches were inferior beings, tolerated rather than welcomed in places like the House of Lords, the colleges of Oxford and the clubs of St. James’s. Plenty of these inferior beings were all for taking away the special standing, the special status of the Church of England, a proceeding known as disestablishment.
A supporter of this proceeding was known as a disestablishmentarian.
The creed of disestablishmentarians was disestablishmentarianism, by analogy with Catholicism, Calvinism and Marxism. With ‘ism’ often being used in this way in these ecclesiastical or political contexts.
And the creed of the people who set themselves up against disestablishmentarianism was antidisestablishmentarianism. Seemingly an acceptable double negative, although I don’t remember any school debate on that point.
As an aside, I note that in my day there used to be established and unestablished civil servants. The latter were, once again, inferior beings with inferior terms of employment. Sometimes they were allowed to progress to the ranks of the established – a usage not that far from that of the church.
This revived interest in long words being prompted by perusing reference 1, then reference 2, in which last I learn that the Seneca language contains a lot of long words, fusing what in English would be a lot of words into one large verbal form, so making Seneca what is called a polysynthetic language. Once widely spoken in the north western part of what is now New York State. Somewhat to the west of the area we visited back in 2014, for which see, for example, reference 5.
The present point of interest being what exactly gets into consciousness in the course of articulating these long words. Which I thought might be served by thinking of an English example, hence ‘antidisestablishmentarianism’. A word which makes it to Wikipedia, but, curiously, not to my edition of OED, where there is nothing between ‘antidinic’ and ‘anti-division’. Perhaps the compilers of OED felt themselves to be above such tiresome disputes, even back in 1888, when the ‘D’ volume was put together and when I imagine that these disputes were a lot more important than they are now, a time when there were plenty of men and women in the streets who actually cared. In what follows we shall suppose that the word does exist, and was once actually used in the intended sense, as opposed to being a mere curiosity.
Parsing affixes
Let us suppose we build a long word with prefixes and suffixes (collectively affixes) along the following lines:
(root)
(root + suffix)
(prefix + (root + suffix))
(prefix + (prefix + (root + suffix)))
And so on, where, in the first instance, any pattern of alternation between prefixes and suffixes is permitted. This freedom will soon be circumscribed by rules about the circumstances in which any particular prefix or suffix is appropriate. So, for example, the suffix ‘ism’ is not appropriate to the word ‘disestablishment’ as the latter is an action, which does not usually give rise to creeds, at least not directly. It also sounds a bit odd – which may be just me, or may actually have some role in the life history of words – on which see below.
When we write the word we do not use either brackets or plus signs, still less when we say it. The various components are just strung together, one after the other, in this last case perhaps helped along by giving the word some rhythm. Our present problem is that (prefix + (prefix + (root + suffix))) becomes (prefix + prefix + root + suffix) which might also be interpreted as ((prefix + (prefix + root)) + suffix) or (prefix + ((prefix + root) + suffix)). If we were dealing with numbers rather than roots and affixes, all these variants would be the same because the operator ‘+’ is what is known as associative. (A + (B + C)) is the same as ((A + B) + C). Which might be a consequence of the way we define numbers and their addition or which we might regard as an axiom; just something which we agree to be true, without digging any deeper.
This is not a problem when a word is built from a root using just prefixes or just suffixes. Then there is just one way to unpack the word. The problem arises from being able to choose whether a prefix or a suffix is to be taken off next. If, for example, we have one prefix and three suffixes, there are four possibilities.
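By way of a check on that count, a minimal sketch in Python, offered as an illustration only. The function name ‘bracketings’ and the placeholder affixes ‘p1’, ‘s1’ and so on are inventions for the purpose, not anything taken from the references. At each step it takes off either the outermost prefix or the outermost suffix and so lists every possible bracketing.

```python
from math import comb


def bracketings(prefixes, root, suffixes):
    """Yield every fully bracketed way of building the word.

    Prefixes and suffixes are listed left to right, as they appear
    in the finished word. The outermost affix is taken off first.
    """
    if not prefixes and not suffixes:
        yield root
        return
    if prefixes:
        # Suppose the leftmost prefix was the last affix to be added.
        for inner in bracketings(prefixes[1:], root, suffixes):
            yield f"({prefixes[0]} + {inner})"
    if suffixes:
        # Suppose the rightmost suffix was the last affix to be added.
        for inner in bracketings(prefixes, root, suffixes[:-1]):
            yield f"({inner} + {suffixes[-1]})"


options = list(bracketings(["p1"], "root", ["s1", "s2", "s3"]))
for option in options:
    print(option)
print(len(options), comb(4, 1))  # 4 4
```

With p prefixes and s suffixes the count is the binomial coefficient C(p + s, p). This gives four for one prefix and three suffixes, and three for the two prefixes and one suffix of the example above.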
We put aside the complication that in many languages a root is modified by addition of an affix, perhaps in the interests of making it easier to say the resultant word, taking into account the mechanics of the mouth, the tongue and the vocal tract.
Summary
All this might be summarised by a rule to generate words from roots and affixes:
<word> ╞ {<root> | (<prefix> <word>) | (<word> <suffix>)}
Where the ‘|’ denotes a choice and the brackets serve to preserve the order in which a word has been built, the exact alternation of prefix and suffix.
We insist on affixes being added one at a time, on there being an order of addition. We do not allow a prefix and a suffix to be added simultaneously, that is to say, in the jargon used above: ‘(<prefix> <word> <suffix>)’. I have no idea whether there are languages which do this.
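The rule can be tried out with a small sketch in Python. The three lexicons below are assumptions made for the illustration, just enough to cover our long word, and the function name ‘parses’ is likewise invented. The function returns every bracketed derivation which the rule allows, without any regard to sense.

```python
PREFIXES = {"anti", "dis"}           # assumed lexicon of prefixes
SUFFIXES = {"ment", "arian", "ism"}  # assumed lexicon of suffixes
ROOTS = {"establish"}                # assumed lexicon of roots


def parses(word):
    """Yield every bracketed derivation of word allowed by the rule."""
    # <word> is a <root>
    if word in ROOTS:
        yield word
    # <word> is (<prefix> <word>)
    for prefix in sorted(PREFIXES):
        if word.startswith(prefix) and len(word) > len(prefix):
            for inner in parses(word[len(prefix):]):
                yield f"({prefix} + {inner})"
    # <word> is (<word> <suffix>)
    for suffix in sorted(SUFFIXES):
        if word.endswith(suffix) and len(word) > len(suffix):
            for inner in parses(word[:-len(suffix)]):
                yield f"({inner} + {suffix})"


results = list(parses("antidisestablishmentarianism"))
for result in results:
    print(result)
print(len(results))  # 10
```

On these assumptions the rule allows ten derivations of the long word, being the number of ways of interleaving two prefixes with three suffixes. Choosing between them needs something more than the rule.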
In what follows, ‘+’ stands for the concatenation of a word with an affix. So ‘a + b’ is a structured, a fancy way of writing ‘ab’ and ‘(a + (b + c))’ is a fancy way of writing ‘abc’. But remember that ‘(a + (b + c))’ is not the same as ‘((a + b) + c)’. We are not talking about numbers, for which addition is associative and the two would be the same: (1 + (2 + 3)) is always 6, even if one has it as ((1 + 2) + 3). So one might just as well write (1 + 2 + 3) and forget about the extra brackets – which is not an option here.
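The point about brackets can be made concrete with a few lines of Python, a sketch only. Nested pairs stand in for the brackets, and ‘flatten’, an invented name, strings the components together as one does in writing. The two builds come out as the same written word while remaining different structures.

```python
def flatten(word):
    """Concatenate a nested (left, right) build into the written word."""
    if isinstance(word, str):
        return word
    left, right = word
    return flatten(left) + flatten(right)


right_nested = ("a", ("b", "c"))  # (a + (b + c))
left_nested = (("a", "b"), "c")   # ((a + b) + c)

print(flatten(right_nested), flatten(left_nested))  # abc abc
print(right_nested == left_nested)                  # False
print((1 + (2 + 3)) == ((1 + 2) + 3))               # True
```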
We also have some grammar built around a correspondence, Φ:
Φ ╞ <word> → <type>…
Where type might be something like occupation or plant, a collection of not necessarily exclusive categories, and where ‘…’ means zero, one or more. So a word might map onto zero, one or more types. A correspondence which is coupled with various rules about which type-affix combinations are possible and about the type of those combinations.
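A sketch of how such a correspondence might be coded in Python. The type labels (‘verb’, ‘action’, ‘person’, ‘creed’) and the rules attached to each affix are hypothetical, chosen only to illustrate the idea, and the tagged tuples are one convenient way of recording whether an affix is a prefix or a suffix. Each rule says what type the inner word must have and what type the combination then has. A word which maps onto no types at all is one which the rules do not permit.

```python
# Hypothetical types for roots.
ROOT_TYPES = {"establish": {"verb"}}

# Hypothetical rules: affix -> [(type needed of inner word, type of result)].
PREFIX_RULES = {
    "dis": [("verb", "verb")],
    "anti": [("creed", "creed")],
}
SUFFIX_RULES = {
    "ment": [("verb", "action")],
    "arian": [("action", "person")],
    "ism": [("person", "creed")],
}


def phi(word):
    """Return the set of types of a word, empty if no rule permits it.

    A word is either a root (a string) or a tuple (kind, affix, inner)
    where kind is "prefix" or "suffix".
    """
    if isinstance(word, str):
        return set(ROOT_TYPES.get(word, ()))
    kind, affix, inner = word
    rules = PREFIX_RULES if kind == "prefix" else SUFFIX_RULES
    inner_types = phi(inner)
    return {result for needed, result in rules.get(affix, [])
            if needed in inner_types}


disestablishment = ("suffix", "ment", ("prefix", "dis", "establish"))
print(phi(disestablishment))                     # {'action'}
print(phi(("suffix", "ism", disestablishment)))  # set()
```

The empty result in the last line corresponds to the earlier remark that ‘ism’ is not appropriate to ‘disestablishment’, an action rather than a person or group of people.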
A second parsing of our long word
In what follows, the words marked at the end with an ‘*’ appear in the OED. Which, not being marked as antique or obsolete, might be supposed to have been in use around 1888 when it came out. And in each case, it was the establishment – or not – of the church which was in question although other meanings were theoretically possible and might, indeed, have become current had the words been more successful than they turned out to be. The words which did not appear in OED might well have appeared in newspaper or conversation at that time, but were presumably not considered sufficiently current, sufficiently strong to warrant inclusion in the dictionary.
(antidisestablishmentarianism)
Option 1: break out the suffix ‘ism’ which makes a creed from a famous person, like Marx, or a group of people, in this case the antidisestablishmentarians.
(antidisestablishmentarian + ism)
But discarded because the establishment of the church is the status-quo and its supporters are unlikely to be active and stroppy, to be inventing any new doctrines, any new ‘ism’s. They just react to those who do want to disestablish the church and whom they don’t much like. Who do invent new ‘ism’s.
Option 2: break out the negative prefix ‘anti’, permitted in various combinations. So ‘Anti-Dühring’ is the name of a book by Engels opposing the views of Dühring. ‘Antimatter’ is a sort of matter, not that long invented, which is opposite to, a sort of mirror image of regular matter. ‘Antisemitic’, usually of a person or persons hostile to or prejudiced against Jewish people. With ‘antique’ serving as a reminder that care is needed. Probably the same Greek or Latin root, but here meaning before rather than against. In any event, in this case the status-quo people are being polite, being against what the disestablishment people believe in, rather than the people themselves.
(anti + disestablishmentarianism)
Now break out the suffix ‘ism’, separating the disestablishment people from their creed. With disestablishmentarian rhyming nicely with parliamentarian. Which gives this new word a bit of standing, illustrating the creation of words on grounds of sound and syntax, rather than of sense and meaning. A word which does appear in OED.
(anti + (disestablishmentarian* + ism))
Break out the suffix ‘arian’ which makes a person out of a (usually political or religious) position. Disestablishment is what is wanted for the reasons encapsulated in the creed of disestablishmentarianism. An action which follows from a set of beliefs.
(anti + ((disestablishment* + arian) + ism))
Break out the suffix ‘ment’. To disestablish is an activity, an action. Disestablishment is the result of that action.
(anti + (((disestablish* + ment) + arian) + ism))
Break out the negative prefix ‘dis’, which reverses the sense of what follows. In this case to undo what establish does.
(anti + ((((dis + establish*) + ment) + arian) + ism))
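The finished parse can be replayed from the inside out, as in the following Python sketch. The names ‘STEPS’ and ‘build’ are inventions for the illustration. The order of the affixes is that of the parse just given, which is one plausible reading rather than the only one.

```python
ROOT = "establish"

# Affixes in the order in which the parse above adds them, innermost first.
STEPS = [
    ("prefix", "dis"),
    ("suffix", "ment"),
    ("suffix", "arian"),
    ("suffix", "ism"),
    ("prefix", "anti"),
]


def build(root, steps):
    """Return the written form of the word after each step, root first."""
    word = root
    stages = [word]
    for kind, affix in steps:
        word = affix + word if kind == "prefix" else word + affix
        stages.append(word)
    return stages


stages = build(ROOT, STEPS)
for stage in stages:
    print(len(stage), stage)
```

The last line printed is the long word itself, at 28 letters.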
Noting that while the last two roots are verbs – establish and disestablish – the others are nouns, as is the long word itself, giving us in reverse order: verb – verb – noun – noun – noun – noun. Seneca admits superficially similar constructions, which are actually much more complicated.
Noting that one might argue about this. With some long words, not particularly this one, there may be more than one perfectly plausible and reasonable way of doing this parsing, doing this alternation. The deconstruction of a word is not necessarily well-defined.
Noting that a word may be successful, may become current, not because it makes particularly good sense, or because it is well constructed, but because it sounds well. It looks well in the newspapers. It caught the fancy of a popular journalist, or the editor of a popular journal.
What is perhaps relevant here is that the parsing proceeds from both ends, cutting out things both from the left and the right. One cannot parse the word as it arrives, one has to wait until the whole word is available, not that one does that after the first few times one encounters it. Perhaps analogous to the problem with German sentences, which may not make much sense, especially to a non-native speaker, until the verb or the verb root, often at the very end of the sentence, turns up.
It is perhaps even true that the compilers of OED got it about right. The brain can cope with three affixes, with disestablishmentarian just about included in the relevant entry – but gives up after that. The longer words exist, perhaps as ink-horn words in Jacobean parlance, but do not really work in the way intended and only exist among scholars and their books. Words in which the form has outgrown the function. The messenger has outgrown the message. Or, as we are reminded from time to time, the point of press officers is to deliver the news, not to become the news.
From where I associate to a common problem in our world, where the trappings of something outgrow the substance. Of which the English monarchy is a curious example, with the trappings and flummery being out of all proportion to the monarchy’s role in day to day affairs – but with the monarchy nevertheless still fulfilling a function, even with its shell being more or less empty. With the UK being another, with our prancing and posturing on the world stage getting more and more out of touch with the reality that, for example, the German economy is a lot larger than our own, that is to say half as big again.
Saying the word, being aware of the word
I try saying the long word to myself, that is to say without articulating it out loud. It takes a few goes to get it straight. And although I can say what the word means happily enough if prompted, I am not sure if anything other than the business of saying the word, than savouring the word itself, gets into consciousness otherwise. Perhaps it would be different if I had a real use for the word. If I was, for example, a committed disestablishmentarian, spending quality time debating the matter. Actually knowing all kinds of odd people calling themselves disestablishmentarians, or perhaps even antidisestablishmentarians. This would give the long word some proper baggage, perhaps short-hand, emotional baggage; baggage which consciousness could make a selection from. Whereas the only baggage I have now is about long words as long words. The goings on in the Press Office, rather than out in the real world.
Another consideration is the length of the word, with experiment suggesting that it takes something under two seconds to say this one out loud, a number arrived at by saying it out loud, but quietly, five times and timing oneself with the Microsoft clock. Which figure seems to be reliable enough for present purposes, despite a tendency to play with the word, to say it in different ways. And sometimes getting stuck on it. I imagine that it can be read silently much faster, particularly when the word has become familiar. But perhaps it is too long to get into consciousness, all at once. I note in passing that two seconds is also hypothesised to be the average duration of a frame of consciousness of LWS-R of reference 11.
In any event, all the other stuff about parsing outlined in the previous section, while true, does not get into awareness, not into my awareness at least. Playing with the word yes, understanding its construction, meanings or associations no.
There must be some interaction here between consciousness and working memory, with the long word winding up in working memory and consciousness being a sort of window sliding around that word. With understanding of such a word being made much easier if working memory is expanded by printing the word on paper or writing it on a white board, where the eyes can scan it at their leisure, bringing unconscious processing resources into play, delivering results into consciousness from time to time. From where I associate to Chater and his argument that there is not very much at all in consciousness at any one time, for something of which see reference 9.
Hills
If I say ‘the man went up the hill’, I am conscious of the words. I can explain what they mean if prompted. But just presently, I am not conscious of anything about the man or the hill. There is no conscious image of either floating about. The conscious mind seems to be happy to operate at the level of the word. Perhaps in the sense of reference 7, we are all children most of the time, working at the level of the word rather than at the level of whatever it is that the word is grounded in. Just playing with words.
Which might sound a bit feeble, but one might add that that is the point of well chosen words. That, to some extent at least, one can put the real world aside and playing with words is good enough.
If I see a man going up a hill, most likely there will be no inner thought about it at all, although I may be able to answer questions about it. And if there was inner thought in words about the action unfolding in front of me, who knows what it might be – with ‘the man went up the hill’ being just one of the many things that one might say. I might, for example, be much more interested in the colour of his duffel coat than in the fact of his going up the hill.
So about as determinate as the first problem, that is to say not very determinate at all, visualising the right man going up the right hill from the bare words. Or perhaps drawing the man going up the hill. Or being asked to improvise some elaboration of the bare words, the bald statement.
The matter itself
Given that I come from a family of atheists, not believing in any church, never mind the Church of England, I thought it right to include a few comments on the matter itself: is it right that this church should occupy a privileged position in the land?
And as a former statistician it also seemed right to start with the statistics, with the Church itself publishing quite a lot of stuff, including the top two panels in the figure above. The left hand one from the splendid document from 1962 at reference 10, at that time available for one guinea, now downloadable for free. All kinds of fascinating stuff, especially statistics about the clergy rather than about their customers – these last being rather more visible thirty years later in the publication from which the right hand panel is taken.
The answer seems to be that nowadays around 1% of the population attend the Church of England on a Sunday in a regular way, well under half the Christian total and quite possibly well under the attendance at mosques on a Friday. So no case in the figures for a position at all, never mind a privileged position.
And while the power of this Church is not what it was, and many abuses – such as ownership of slum housing – have been dealt with, the church has a far bigger role in the education of our children than I care for. Not helped by Past Master Blair’s enthusiasm for faith schools of all sorts.
But in the round, like the monarchy, a bit of antique furniture, not worth all the fuss it would take to get rid of it. More important problems to spend political band-width on. So I am not a full-on disestablishmentarian and I do not subscribe to disestablishmentarianism.
Conclusions
An interesting window onto the interaction between language, awareness (or consciousness) and working memory. A window which suggests that there is less of interest going on in consciousness than might at first be thought. A window which might well merit a bit more work.
Postscript
The modern use of the word ‘antisemitic’, mentioned above, is itself curious, given that ‘Semitic’ is properly used to describe the group of Middle Eastern languages which includes Arabic (300 million), Amharic (20 million), Tigrinya (7 million), Hebrew (around 5 million), Tigre (around one million), Aramaic (less than 1 million) and Maltese (half a million). Aramaic being the language which superseded Hebrew, before being itself superseded by Arabic. Probably the language of the first Christians. This from Wikipedia.
References
Reference 1: Sequences of Intonation Units form a ~1 Hz rhythm - Maya Inbar, Eitan Grossman & Ayelet N. Landau – 2020.
Reference 2: Seneca morphology and dictionary – Wallace L Chafe – 1967. Smithsonian Contributions to Anthropology, Volume 4.
Reference 3: https://en.wikipedia.org/wiki/Seneca_language.
Reference 4: https://senecalanguage.com/.
Reference 5: https://psmv2.blogspot.com/2014/10/outdoor-options.html.
Reference 6: https://en.wikipedia.org/wiki/Antidisestablishmentarianism.
Reference 7: https://psmv4.blogspot.com/2020/10/the-power-of-word.html.
Reference 8: https://en.wikipedia.org/wiki/Polysynthetic_language.
Reference 9: https://psmv3.blogspot.com/2018/08/the-myth-of-unconscious.html.
Reference 10: Facts and figures about the Church of England – The central board of finance of the Church of England – 1962.
Reference 11: http://psmv4.blogspot.com/2020/09/an-updated-introduction-to-lws-r.html.