Sunday, 29 March 2020

A very short history of computing

This by way of a preamble to a post to come about the much discussed question of cookies, prompted by the article at reference 2. I proceed by way of a series of scenarios, very roughly in chronological order of appearance, starting in the mid-1970s, about the time I arrived in computing, not then called IT. A history which focusses on the relations between individuals, now often at home, and large, central computers in data centres – and which concludes that we do indeed need cookies – or some equivalent device.

A time when most computers were in governments and large corporations and spent a lot of their time processing administrative records of one sort or another, for example birth registrations, bank accounts and insurance records. Records which were often created by data entry clerks, fast and accurate, but not otherwise needing to know much about what they were doing. Since which time, the pen-pushing, knowledgeable clerk, once running in large herds like the buffalo, has become an endangered species.

A half century during which the price, power and performance of central computers, storage, communications, terminals and personal computers have changed beyond all recognition. With the shifting relativities of these components moving the balance of power backwards and forwards, rather as changes in the relative prices of steel and concrete shift the designs of large buildings.

During which time the complexity of the software running on this equipment has also increased beyond all recognition. No one person can even sketch out the whole of it any more – and it is only kept manageable at all by subcontracting lots of the work out to standardised components.

Figure 1
Our first scenario, left in the figure above, is a mainframe computer, perhaps something like an ICL 1904 running George III. Such a computer is capable of running quite a lot of programs at the same time. But very roughly speaking, these programs have their own code and data areas, fixed for the duration of the run, and do not talk to each other. They are all running independently – although once a program has stopped and has released its data, that data might be re-used by some other program. All the knowledge and all the work is in the one place.

Figure 2
This is illustrated in the figure above. Our program and its data have been loaded into segment [α, β] of memory in the computer, which might, in those days, have run to 100,000 of our bytes. Sometimes, this would have been a large proportion of the memory available, sometimes a small proportion. But whatever it was, it was fixed for the duration of the run. The program ran by executing one instruction after another, keeping its place by means of a pointer, called a cursor here – what would now be called a program counter – until it executed a stop instruction. Which may not have happened, at least not in reasonable time, in which case the operator had to intervene, from on high, as it were.

In this example, the cursor is at A and the instruction is to use the data at locations B, C and D to compute something to be put in location E. By default, the cursor then advances one step to F, but exceptionally the instruction may be to set the cursor to somewhere else, say to G, usually instead of writing some data to location E.

But whatever the case, the program’s cursor must stay within the brown area and the program may only address data within the blue area. It is as if the rest of the computer, not to mention the rest of the world, did not exist. Although, that said, things did go wrong from time to time.
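
For those who like such things, a minimal sketch of the cursor at work, in Python rather than anything an ICL 1904 would have recognised; the instruction set and the layout of the data are invented for the purpose:

```python
# A sketch of scenario 1: a program stepping through a fixed code
# segment with a cursor. Instructions and data locations are invented.

code = [
    ("add", "B", "C", "E"),    # E := B + C
    ("jump_if_zero", "E", 3),  # exceptionally, set the cursor elsewhere
    ("add", "E", "D", "E"),    # E := E + D
    ("stop",),
]
data = {"B": 2, "C": 3, "D": 4, "E": 0}

cursor = 0                         # the cursor starts at the top
while True:
    instruction = code[cursor]     # fetch the instruction at the cursor
    op = instruction[0]
    if op == "add":
        _, x, y, target = instruction
        data[target] = data[x] + data[y]
        cursor += 1                # by default, advance one step
    elif op == "jump_if_zero":
        _, x, target = instruction
        cursor = target if data[x] == 0 else cursor + 1
    elif op == "stop":
        break                      # without this, the operator intervenes

print(data)   # {'B': 2, 'C': 3, 'D': 4, 'E': 9}
```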

Our second scenario, right in Figure 1 above, marks the arrival of the database, a complicated program in its own right (program 2), which enables other programs (in the example above, programs 1 and 3) to share data, to use data from the same data store, at the same time. A data store which might, for example, contain all the pay records of the 80,000 or so John Lewis partners. This program will include mechanisms to stop a second program interfering with data being used by a first. It will also include various mechanisms to help with data quality, security and integrity. In due course, Microsoft’s SQL Server became a popular (and cheap) database program.

The knowledge of and work on our payroll is still in one place, but it has been divided between two programs, with program 1 sometimes described as front end processing and program 2 as back end processing.
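
A small sketch of the locking idea, here using Python and its built-in SQLite rather than SQL Server; the payroll table and its contents are invented for the purpose:

```python
import sqlite3

# Scenario 2: two front end programs sharing one data store through a
# database program, which stops them interfering with each other.

db = "payroll.db"

setup = sqlite3.connect(db)
setup.execute("CREATE TABLE IF NOT EXISTS pay (partner TEXT PRIMARY KEY, salary INTEGER)")
setup.execute("INSERT OR REPLACE INTO pay VALUES ('A N Other', 25000)")
setup.commit()

# Program 1 opens a transaction, which takes a lock on the data...
program1 = sqlite3.connect(db)
program1.execute("BEGIN IMMEDIATE")
program1.execute("UPDATE pay SET salary = salary + 1000 WHERE partner = 'A N Other'")

# ...so program 3, trying to write at the same time, is made to wait
# (and here, given a short timeout, gives up) rather than trampling on it.
program3 = sqlite3.connect(db, timeout=0.1)
try:
    program3.execute("BEGIN IMMEDIATE")
except sqlite3.OperationalError as e:
    print("program 3 must wait:", e)   # 'database is locked'

program1.commit()   # program 1 finishes; program 3 may now have its turn
```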

Figure 3
Our third scenario, left in the figure above, marks the arrival of teleprocessing, the connection of possibly large numbers of display screens – dumb terminals in the jargon of the time – to a central program – sometimes called a TP monitor. These dumb terminals might be in the same building as the central computer but might be some way away, sometimes in another country. The central program is up and running more or less all the time, while the dumb terminals connect to it as and when.

The activity and status of each dumb terminal is held at the centre in what is here called local data; one lot of local data for each dumb terminal. Knowledge and work is still very largely with the centre, even though we now have users on the periphery.
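
A sketch of the TP monitor's book-keeping, one lot of local data for each dumb terminal, held at the centre; the terminal identifiers and field names are invented for the purpose:

```python
# Scenario 3: the centre keeps the activity and status of each dumb
# terminal, keyed by terminal identifier. All the knowledge stays here.

local_data = {}   # one entry per connected terminal

def connect(terminal_id):
    """A terminal connects as and when; the centre opens local data for it."""
    local_data[terminal_id] = {"screen": "MENU", "fields": {}}

def receive(terminal_id, field, value):
    """The terminal sends up a keyed-in field; the centre records it."""
    local_data[terminal_id]["fields"][field] = value

def disconnect(terminal_id):
    """The terminal goes away and its local data goes with it."""
    del local_data[terminal_id]

connect("TERM042")
receive("TERM042", "name", "A N Other")
receive("TERM042", "department", "Haberdashery")
print(local_data["TERM042"])
# {'screen': 'MENU', 'fields': {'name': 'A N Other', 'department': 'Haberdashery'}}
```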

Figure 4
The display of the dumb terminal would often be green on black, perhaps something like the figure above, although this particular example looks to be taken from a stand-alone personal computer, with which we are not concerned here. The display was generally character based, with no fancy fonts and no pictures, arranged in lines and fields, for example the name and department of the person concerned. The end user would interact with the central program by keying stuff into those fields. Plus one or more special fields in which the user could enter instructions about what the central computer was to do next.

Figure 5
Our fourth scenario, on the right in Figure 3, marks the arrival of the personal computer, a terminal which was not dumb at all and which could do rather more than display simple data from a central program. Part of this was to do with efficiency, at a time when communications over a distance were more difficult and expensive than they are now. An intelligent terminal could take on a lot of the routine formatting and checking of data coming in from the user that would otherwise need to be going up and down the line. And burning up the then scarce central resources. Part of this was expressed by downloading special programs called clients to the personal computer, clients whose job it was to drive the interaction between the user and his personal computer and between the personal computer and the mother ship.
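
A sketch of the intelligent terminal at work, routine checking done locally so that only clean data goes up and down the line; the rules, the field names and the send_to_centre function are all invented for illustration:

```python
# Scenario 4: the client on the personal computer checks the user's
# input before anything is sent to the centre.

def send_to_centre(fields):
    """Hypothetical stand-in for the upload to the mother ship."""
    print("sending up the line:", fields)

def check_locally(fields):
    """Checks that would otherwise burn up scarce central resources."""
    errors = []
    if not fields.get("name", "").strip():
        errors.append("name must not be blank")
    if not fields.get("department", "").isalpha():
        errors.append("department must be letters only")
    return errors

fields = {"name": "A N Other", "department": "Haberdashery"}
errors = check_locally(fields)
if errors:
    print("fix before sending:", errors)   # nothing goes up the line
else:
    send_to_centre(fields)
```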

The job of keeping track of what the user was up to at any particular moment was shared between the computer at the centre and the local computer. But the overall effect was a transfer of both knowledge and work from the centre to the periphery.

In the beginning, a lot of these personal computers were built by IBM and the displays were very rudimentary compared with what can be done now, with Figure 5 above giving something of the idea. While now, most of these local computers run under Microsoft’s Windows software, the early development of which owed a good deal to work paid for by said IBM, and which provides a well-known and stable environment within which others can run their own programs. A sample of which is provided by Figure 7 below.

Figure 6
Our fifth scenario marks the arrival of the Internet and broadband connections all over the world, at least the developed part of the world, with huge numbers of personal computers connected to central computers, from both homes and businesses.

A context in which it became difficult for the centre to keep track of what the periphery was up to, a difficulty which was resolved by the invention of cookies which were kept in-between times on the personal computers, out on the periphery, but sent along to the central computer for the duration of each transaction with it. Cookies which are analogous to the state variables and control variables which one might have in a standalone program in Visual Basic. One example of the sort of thing that these cookies do is to hold one’s basket while one is shopping. Another is to hold one’s credentials to access some central computer, to reduce the number of times you have to present them. To which end, my understanding is that the location and content of cookies are deliberately obscured to make it harder, certainly for the common or garden user, to spoof them.
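
A small sketch of the round trip, using Python's standard library; the names and values are invented, and a real site would do rather more to obscure and protect them:

```python
from http.cookies import SimpleCookie

# Scenario 5: the cookie round trip. The central computer sets a cookie
# on the way out...
outgoing = SimpleCookie()
outgoing["basket"] = "bacchus-and-ariadne-print"
outgoing["basket"]["path"] = "/shop"
print(outgoing.output())
# Set-Cookie: basket=bacchus-and-ariadne-print; Path=/shop

# ...the personal computer keeps it in-between times, and sends it back
# with each subsequent request, so the centre knows where we had got to.
incoming = SimpleCookie("basket=bacchus-and-ariadne-print")
print(incoming["basket"].value)   # bacchus-and-ariadne-print
```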

Another innovation was the use of HTML messages – plus other widgets and wheezes – to describe the pages to be displayed on personal computers, including here the spaces, the fields, in which users might be asked to enter data. This meant that there could be a rich user experience without the personal computers needing to know anything much about it or needing anything other than general purpose tools, tools which often come as part of Windows and Office. No need to download and manage clients, which by the end of the twentieth century had become a major drain on the resources of IT departments.
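
A sketch of the idea, a minimal central program which describes a page of fields in HTML, leaving any general purpose browser to do the displaying; the page and the addresses are invented for the purpose:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Scenario 5 continued: the page, fields and all, is described in HTML
# and sent down the line. No special client to download and manage.

PAGE = b"""<html><body>
<form method="post" action="/submit">
  Name: <input name="name"><br>
  Department: <input name="department"><br>
  <input type="submit" value="Send to the centre">
</form>
</body></html>"""

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(PAGE)

HTTPServer(("localhost", 8000), Handler).serve_forever()   # until interrupted
```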

Figure 7
Figure 8
Note that the unit of display on the personal computer is the page, which may be far too big for the screen in question. The solution here is, as necessary, to add horizontal and vertical scroll bars, although this becomes clumsy with the very small screens on hand held computers, aka mobile phones. Furthermore, individual elements of the page may have their own scroll bars. So, in the example above, we have a result from the Google search engine, run from within Microsoft’s Edge: the page returned has a vertical scroll bar and the subordinate element on the right has another. Not sure why Titian’s Venus of Urbino gets into the results of an inquiry about his Bacchus and Ariadne – but there is fun to be had by twisting one’s head to the right and looking at it sideways.

The box diagram which follows (as Figure 8) is not very accurate, but it does try to suggest the complexity of what is happening on the personal computer, with many layers of software, with all kinds of complicated connections. Furthermore, the distinction between data and code has become very blurred and the code has become very dynamic, a far cry from the relatively tractable world of scenario 1 – although it seemed quite complicated enough at the time.

In principle, Microsoft have access to all kinds of more or less private information, suggested at the bottom of the red box. Who knows what they do with it and who they share it with. In any event, knowledge (and power) has been transferred from the periphery back to the centre, while leaving the periphery to do a lot of the work. The grunt work.

Note also that the design and construction of these complex systems is largely incremental. Which means that a lot of the design and a lot of the actual computer code is quite old. With some of this last being old to the point where maintenance is very difficult, if it is possible at all. Witness the peak in demand for heritage IT contractors at the time of the scare about the millennium bug, that is to say back in 1998 and 1999. Also Lieberman on the evolution of the head, last noticed at reference 1.

Figure 9
Our sixth scenario, the last, takes things just a little further. To a world where the Internet is connected to big computers, little computers, personal computers, mobile phones and all kinds of other devices – all talking to each other in a more or less controlled way, in which any one transaction can be thought of as being along one of the lines suggested above.

Note that a large computer talking to another large computer is not that different, at the level at which we are talking, to a large computer talking to a small computer. It is just that there is more at stake and more is done to protect the integrity of the transaction involved.

A still shorter history

In the beginning, a program ran on just one computer, it did not talk to other computers and there was a clear separation between data and code, with each occupying a fixed segment of computer memory during program execution and with the code segment only being allowed to address the data segment. It was very clear where the code and data were, what they were doing and who owned them. This was suggested by Figure 2 above.

Now, while the principles are still the same, the facts on the ground have become hugely more complicated. The distinction between data and code has been blurred. The concepts of location and ownership have been blurred. One computer needs to know about, to keep track of, what is happening on another. And while one might fret about what the cookies that follow get up to, what they are used for, they are very necessary. The mother ship does need to know what all those personal computers out there on its periphery are up to if it is to play its proper part, its intended part.

At the moment this is done with cookies, sufficiently complicated to be subject to their own standards, standards addressed at reference 3. It may well be that someone has cooked up, or will cook up, some other way of doing the same job – but it seems that any such other way is going to raise much the same issues.

In two sentences, the IT which joins us together has become enormously complicated and cookies are an essential part of that. The trick is to make sure that they are used, rather than abused.

References

Reference 1: https://psmv4.blogspot.com/2020/02/more-pivot-table.html

Reference 2: Google plan to lock down user data draws fire from advertisers: publishers warn ‘self-serving’ move could entrench search group’s dominance – Madhumita Murgia and Alex Barker – 2019. That is to say somewhere in the FT world on November 14th, 2019.

Reference 3: HTTP Cookies: Standards, Privacy, and Politics – David M. Kristol – 2001. A little long in the tooth, but there is a lot here from one of the inventors of the cookies we now have.
