Encoding text information on a computer
A computer is a complex device that can create, convert and store information. However, it does not work in a form that is readily intelligible to us: graphical, textual and numerical data are all stored as arrays of binary numbers. In this article, we'll look at how text information is encoded.
What is a text for us is, for a computer, a sequence of symbols. Each symbol is represented by a specific set of zeros and ones. Symbols here mean not only the lowercase and capital letters of the Latin alphabet, but also punctuation marks, arithmetic signs, service symbols, special symbols and even the space.
Binary encoding of textual information
When you press a key on the keyboard, its internal controller sends an electrical signal that is converted to binary code. The code is matched to a specific character, which is then displayed. To represent the Latin alphabet in digital form, the international ASCII coding system was created. Standard ASCII defines 128 characters with 7-bit codes, but each character is normally stored in 1 byte, that is, as an eight-digit sequence of zeros and ones. A byte ranges from 00000000 to 11111111, so an 8-bit table built on ASCII can represent 256 symbols. For a single alphabet this is usually enough.
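The mapping from a character to its eight-digit code can be illustrated with a minimal Python sketch. The helper name to_bits is hypothetical, chosen only for this example; ord and format are standard built-ins.

```python
def to_bits(ch: str) -> str:
    """Return the eight-digit binary code of one ASCII character."""
    code = ord(ch)  # numeric code of the character
    if code > 127:
        raise ValueError(f"{ch!r} is outside the standard ASCII range")
    return format(code, "08b")  # pad with leading zeros to 8 digits


# Print each character of a short string with its decimal and binary code.
for ch in "Hi!":
    print(ch, ord(ch), to_bits(ch))
```

Note that the space is a character like any other: to_bits(" ") gives 00100000.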
An 8-bit ASCII-based table is divided into two parts. The first 128 characters (from 00000000 to 01111111) are international: control characters, digits, punctuation marks and the letters of the English alphabet. The second part, the extension (from 10000000 to 11111111), is intended for national alphabets whose writing differs from Latin.
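Which half of the table a byte belongs to is decided by its highest bit. The sketch below assumes the CP 1251 code page for the Cyrillic letter; table_half is a hypothetical helper name used only for illustration.

```python
def table_half(byte: int) -> str:
    """Tell which half of an 8-bit table a byte value falls into."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a byte must be between 0 and 255")
    # Values below 10000000 (0x80) have the highest bit cleared.
    return "standard" if byte < 0x80 else "extended"


# A Latin letter lands in the first half, a Cyrillic letter in the second.
for ch in "Zя":
    byte = ch.encode("cp1251")[0]
    print(ch, format(byte, "08b"), table_half(byte))
```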
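ASCII is built on the principle of an increasing sequence: the later a Latin letter comes in the alphabet, the greater the value of its ASCII code. The digits follow the same principle, and so does the Russian part of the table in code pages such as CP 1251, apart from the letter Ё, which is placed separately. KOI-8 is a notable exception, since its Cyrillic letters are not arranged in alphabetical order.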
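The ordering principle is easy to check in Python. In this sketch the helper names codes and is_strictly_increasing are hypothetical; the codec names "ascii", "cp1251" and "koi8_r" are standard Python codecs.

```python
def codes(text: str, encoding: str) -> list[int]:
    """Return the byte values of text in a single-byte encoding."""
    return list(text.encode(encoding))


def is_strictly_increasing(values: list[int]) -> bool:
    """True if every value is greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


print(is_strictly_increasing(codes("ABCXYZ", "ascii")))      # Latin letters
print(is_strictly_increasing(codes("0123456789", "ascii")))  # digits
print(is_strictly_increasing(codes("АБВГД", "cp1251")))      # Cyrillic, CP 1251
print(is_strictly_increasing(codes("АБВГД", "koi8_r")))      # Cyrillic, KOI8-R
```

The first three checks succeed, while the KOI8-R check fails. This ordering is why sorting text by byte value gives alphabetical order in some encodings but not in others.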
However, there are several more encodings for Cyrillic letters in the world. The most common are KOI-8 (an 8-bit encoding dating from the 1970s and long used on Russian-language Unix systems), ISO 8859-5 (developed by the International Organization for Standardization), CP 1251 (the encoding used in Windows), and Unicode, whose original 2-byte form can represent 65,536 characters (modern Unicode extends beyond this range). This variety exists because the encodings were developed at different times, for different operating systems and for different reasons. As a result, difficulties often arise when transferring text from one environment to another: if the encodings do not match, the user sees only a set of incomprehensible symbols. How can this be fixed? Word, for example, shows a message about problems with displaying the text when such a document is opened and offers several options for converting the encoding.
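The mismatch described above can be reproduced in a few lines of Python. This is only a sketch, assuming a text saved in CP 1251 and mistakenly read as KOI8-R; the helper name reinterpret is hypothetical.

```python
def reinterpret(text: str, wrong: str, right: str) -> str:
    """Undo a wrong decoding: get the original bytes back, then decode correctly."""
    return text.encode(wrong).decode(right)


original = "Привет"
raw = original.encode("cp1251")   # bytes as stored on disk
garbled = raw.decode("koi8_r")    # read with the wrong encoding
restored = reinterpret(garbled, wrong="koi8_r", right="cp1251")

print(garbled)                    # unreadable mix of Cyrillic letters
print(restored == original)       # the text is recovered
```

The recovery works here because both encodings are single-byte and KOI8-R assigns a character to every byte value, so no information is lost by the wrong decoding. With other pairs of encodings the round trip is not always possible.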
So, the encoding and processing of text information inside a computer is a rather complicated and time-consuming process. Every symbol of any alphabet is just a certain sequence of binary digits, and in a single-byte encoding each symbol occupies one cell, that is, one byte of information.