Tuesday, 21 April 2020

Endianness (Little Endian and Big Endian) and Memory Structure


In this article we look at memory basics and the endianness of an embedded system: what people mean when they say little endian or big endian, and how to find the endianness of a system.
          Endianness describes how a multi-byte number is stored in memory, that is, its byte order: whether the least significant byte is stored at the lower or the higher memory address. Since it describes which end of the number goes to which end of the memory, this property is called endianness.
          Before looking at endianness we need a few prerequisite terms, which are explained below:

•  Byte: A group of 8 bits forming a unit of data. We usually express the size of data in bytes. For example, the character datatype holds 1 byte, an array occupies some number of bytes, and memory size is also given in bytes.

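•  Word: The natural unit of data that a processor handles in one operation, usually equal to the width of its registers and data bus. If the underlying controller is a 16-bit controller, the word size is 2 bytes (16 bits = 2 bytes). Similarly, for a 32-bit controller the word size is 4 bytes. On many embedded compilers the integer datatype (int) has the same size as the word, but the C language does not guarantee this (for example, int is commonly 2 bytes on 8-bit controllers).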

•  Bus: The collection of physical wires that carry bits from one unit or module to another. In embedded systems we have serial buses and parallel buses. A serial bus carries data bit by bit as a stream over a single data line and is usually used for transferring data between chips. A parallel bus, on the other hand, carries multiple bits at the same time from one unit to another, and most often these units are within a single chip. So, roughly speaking, serial buses are used for inter-chip communication and parallel buses for intra-chip communication. Every chip has two main types of parallel bus: the address bus and the data bus. The width of the data path (registers and data bus) is what is usually meant when a controller is called an n-bit architecture or controller. For example, if the data bus carries 32 bits (32 lines) in parallel, the controller is called a 32-bit microcontroller. In this article we assume that the address bus and the data bus have the same width, which is common on 32-bit controllers but not universal (many 8-bit controllers have a 16-bit address bus).

•  Address bus and memory size: These parallel buses (address and data) carry bits from one unit to another within the controller, for example between RAM (random access memory, the volatile primary memory) and the CPU, or between the various peripherals and the CPU. As their names indicate, the data bus carries the data and the address bus carries the address. The width of the address bus decides the maximum possible memory size. How, you ask? Let me explain:
Let us say the address bus is 2 bits wide. We know that the address bus carries the address of a memory location. With 2 bits, what are the minimum and maximum address values that can be carried? With all address lines at zero, the value is 0 (00). The maximum value occurs when all address lines carry 1, so it is 3 (11). That gives 4 distinct addresses, each holding 1 byte (addresses 00, 01, 10 and 11). So with a 2-bit address bus the maximum possible memory size is 4 bytes. Similarly, if the address bus is 16 bits wide, the maximum memory is 2^16 bytes = 64 kilobytes. In general, if the address bus is n bits wide, the maximum possible memory is 2^n bytes. Remember that everything the CPU addresses directly, such as RAM, memory-mapped flash and the SFRs (special function registers), has to fit within this range. A wider address bus increases the addressable memory, and a wider data bus lets more bits be transferred per access.

•  Memory arrangement: Even though memory is viewed in bytes and each byte has its own address, it is arranged in words. Memory can be visualised as a stack of words, each consisting of as many bytes as the word size defines. The diagrams below show the memory view for a 16-bit controller and a 32-bit controller.
16 Bit controller Memory visualization
32 bit controller memory visualization


RULE OF THUMB: Each byte has a memory address. For a word access, the address bus carries the address of the first byte of the word, and the data bus carries the content of the whole word, not just that byte.


Let us work through the 32-bit architecture shown in the diagram; other architectures can be understood in the same way.

Example memory structure - 32 bit

Remember that in this article the address bus and the data bus have the same width, here 32 bits. Let us access the memory at 0x30A04800. The 32-bit address bus will carry 0x30A04800 (00110000101000000100100000000000) and the data bus will carry 0x28B345A1 (00101000101100110100010110100001). You might think that the data bus should carry only 0xA1, since that is the value stored at the address on the address bus, but that is not the case. For a word access the address bus carries the address of the first byte of the word (which byte of the number that is depends on little endian or big endian, as we will see later in this article) and the data bus carries the whole content of the word (all 4 bytes here). In this simplified model a word access never places 0x30A04801 on the address bus; the next word address after 0x30A04800 is 0x30A04804. Real controllers still let a program read the single byte at 0x30A04801, typically by fetching the containing word and selecting the byte. Let us try to understand this with the help of a C program.

In the C programming language we use pointers to work with addresses. Suppose we have declared a pointer to a 4-byte type (for example int *ptr on this 32-bit controller) and after initialisation it points to address 0x30A04804. If we increment the pointer with ptr++, its value becomes 0x30A04808. If we instead decrement it with ptr--, its value becomes 0x30A04800. So a pointer is always incremented and decremented by the size of the type it points to, which for an int pointer on this controller is the word size; a char pointer moves one byte at a time. If we force an int pointer to an intermediate address such as 0x30A04801 (which needs an explicit cast to compile cleanly), the access is misaligned: depending on the controller it may be slower, raise a fault, or give unexpected results. When you read the value through the pointer (*ptr), the number of bytes read and how they are interpreted depend on the declared datatype of ptr. To understand this, see the C code snippet below:

unsigned int *ptr1;
unsigned char *ptr2;
// Assume ptr1 and ptr2 both point to 0x30A04800
printf("%X, %X", *ptr1, *ptr2);

Output: 28B345A1, A1

When reading through ptr1, the complete word is read, and since ptr1 points to an integer whose size equals the word size, the whole word value is printed.
When reading through ptr2, the hardware may still fetch the whole word, but since ptr2 points to a char, only the byte at address 0x30A04800 is used. In this example that byte is the least significant byte, 0xA1, which means the number is stored in little endian order.
To conclude: for word accesses the address bus carries the address of the first byte of the word, which is a multiple of the word size. The data bus carries the whole word at that address and, depending on the datatype, part or all of that value is used in the program logic.

With these basics in place, let us look at endianness. Endianness defines which end of the number is stored at the lower address of the word and which end is stored at the higher address. There are two types of endianness: little endian and big endian.


Little Endian: If the least significant byte of the number is stored in the lowest-addressed byte of the word and the most significant byte is stored in the highest-addressed byte of the word, the system is called a little endian system. Let us store the number 0x12345678 in a little endian system on a 32-bit microcontroller, starting at address 0x80004000. It is stored as follows:

Address        Byte stored
0x80004000     0x78 (least significant byte)
0x80004001     0x56
0x80004002     0x34
0x80004003     0x12 (most significant byte)

Little Endian Example
Intel-based and Infineon-based controllers are little endian systems.



Big Endian: If the least significant byte of the number is stored in the highest-addressed byte of the word and the most significant byte is stored in the lowest-addressed byte of the word, the system is called a big endian system. Let us store the number 0x12345678 in a big endian system on a 32-bit microcontroller, starting at address 0x80004000. It is stored as follows:

Address        Byte stored
0x80004000     0x12 (most significant byte)
0x80004001     0x34
0x80004002     0x56
0x80004003     0x78 (least significant byte)

Big Endian Example
Motorola-based controllers (for example from STMicroelectronics, JDP, etc.) are big endian systems.


Summary: When we say a controller is an 8-bit, 16-bit, 32-bit or 64-bit controller, we are referring to the width of its data path: the size of its registers and data bus, which is the word size. In the simplified model used in this article the address bus has the same width. If a bus has 16 parallel lines it can carry 16 bits in parallel and is called a 16-bit bus. The address bus width decides the maximum addressable memory size. For example, in a 16-bit microcontroller of this kind the address bus and the data bus each have 16 lines and carry 16 bits simultaneously. The word size is 16 bits, that is 2 bytes, and the integer datatype is typically 2 bytes as well. A pointer needs 2 bytes of storage, since a memory address is 2 bytes long. The maximum addressable memory is 2^16 bytes (65,536 bytes or 64 KB).
          If the least significant byte of a number is stored at the lower address of a word in memory, it is a little endian system. If, on the other hand, the most significant byte is stored at the lower address of the word, it is a big endian system.


1 comment:

  1. Hi Sir,
    Thanks for the post
    Could you please upload video of this post in YouTube if available
