Tuesday, 25 August 2020

GSoC - Final report

Phase 1 -

  • My initial focus for the project was to completely isolate two thread-stacks. As a prerequisite, this required using the ARMv7 MMU to isolate blocks of memory dynamically. Through the first two weeks, I worked on providing support for dynamic memory isolation. Code sample.
  • The next part of the task involved providing proper context switch support for setting/unsetting memory attributes of the thread-stacks. This completed my first phase of GSoC.
  • These 4 blog posts provide an explanation of my work in phase 1.
  • A complete view of my work in phase 1 can be found here.

Reflection on Phase 1 - Phase 1 started with a fair bit of skepticism on my side, as I had never done anything remotely close to memory isolation. This was compounded by the fact that I was unfamiliar with the RTEMS codebase. Nevertheless, my mentors gave me some valuable suggestions for making incremental progress. Once I had written some code that successfully provided basic memory isolation, I felt more sure about the direction my project was heading in. Another major challenge in this phase was writing the context switching code for protected stacks in assembly, as I did not have a lot of experience with ARM assembly. Here, the good old GNU debugger came to my help, as I checked the changes in all the register values after every instruction. It was a slow and tedious process: it took a week to get the context switch code right, even though the volume of code was not very large.


Phase 2 - 

  • The focus in phase 2 was to get an end-to-end working, optimized solution for strict thread-stack isolation and to provide basic support for thread-stack sharing.
  • This end-to-end solution was tested with a simple test in which a thread tries to read from the stack of another thread that has been switched out.
  • This test laid bare some dormant problems in my implementation that had gone unnoticed. I discuss them and their solutions in this post.
  • Up until this point, I had been using separate structures for tracking protected stacks, and the memory for these structures was allocated dynamically. I integrated these structures into the existing Stack Control structure so that they can be allocated statically, based on the number of threads configured by the application.
  • The next part involved providing a naming mechanism for the protected thread-stacks. This was done in order to identify the memory region to be allocated from shm_open().

Reflection on Phase 2 - For the first two weeks of this phase I was relatively unproductive, as I had to appear for my end semester examinations. Once they were over, I quickly started making progress on my plans. This is when I encountered a major hurdle in my project: I was getting fatal exceptions left, right, and center in my context switch code, although in theory my mechanism for isolating thread stacks seemed sound. I discussed this on the mailing list and got some useful advice. That did not solve the problem completely, so I had to rethink and go through all of my assumptions from scratch to figure out exactly what was going wrong. Finally, I found the faulty assumptions, which in hindsight were glaringly obvious. A detailed explanation of what was going wrong and my solution can be found here. On a positive note, going through everything all over again solidified some of my fundamentals related to memory isolation and how an MMU works in general.


Phase 3
  • After completing the basic support for thread-stack sharing, my focus in this phase was to support thread-stack sharing through high-level POSIX APIs. Another important goal for this phase was to optimize my code and make it merge-ready by using more of the structures that are already present in the RTEMS codebase.
  • I have discussed the rationale behind high-level design choices for stack-sharing here. The relevant code for stack-sharing can be found here.
  • The above changes had some flaws. They were not optimized enough and the mechanism for stack-naming was not integrated properly to the stack control structure. These patches moved my code a bit closer to merging but some of the issues with proper integration into the stack control structure still remained.
  • I also worked on providing a mechanism for the conditional configuration of thread-stack protection. This enables the user to enable the thread-stack protection conditionally at build time. This option is based on the new build system of RTEMS.


Reflection on Phase 3 - By the time phase 3 started, I had managed to come up with a working, stable solution for thread-stack isolation and sharing, or at least so I thought. When I tested my code against test cases that involved thread dispatching, my solution broke. After discussing it on the mailing list, I realized the mistake and was able to resolve the issue. This phase primarily consisted of refactoring my code to make it mergeable and smoothing out the rough edges where thread-stack isolation was failing.

Final work -  

Future Work - There are primarily two important things that need to be taken care of before this work becomes merge-ready - 
  • Currently, the mechanism for isolating thread-stacks requires setting/unsetting the memory regions corresponding to the thread-stack. This can be optimized by changing the page-table base during each context switch.
  • The deletion of threads at the end of their life cycle also needs to be handled.

Steps to re-create the current work -    

  • Clone the Final_release branch from my repo.
  • Since the configuration option is based on the new build system, the standard 'make' option is not compatible with this feature. Refer to Sebastian Huber's rtems-docs repo, chapter 7, for the basic setup of the new build system.
  • The important point to keep in mind is to set the RTEMS_THREAD_STACK_PROTECTION option to True in my repo's config.ini file before the './waf configure' command.
  • Currently, thread stack isolation and sharing work only for arm/realview_pbx_a9 BSP.
  • Simulation has been done on QEMU.
  • For a demonstration of the thread-stack isolation refer to the thread_stack_protection test in the testsuite. For thread-stack sharing mechanism refer to the thread_stack_sharing test.

Saturday, 22 August 2020

High level design and implementation of thread-stack sharing

In this post, we will discuss the high-level design for sharing thread stacks. Our focus will be to make the design as POSIX-compliant as possible. But first - 

Why do we need to share thread stacks?

There are certain operations in RTEMS in which a thread reads from or writes to the stack of another thread. This includes IPC mechanisms such as message queues; in fact, all blocking reads (sockets, files, etc.) read from or write to the stack of a different thread. If we have completely isolated thread stacks from each other, these valid operations will raise fatal exceptions whenever they access the stack of a different thread. Hence we need to share thread stacks to enable these operations.

The mechanism for sharing thread-stacks - In the last two posts, we discussed the strict-isolation of thread-stacks. We saw that when a thread is executing, it only has access to its stack and the global data. This is made possible by unsetting (set to NO-ACCESS) the memory attributes of the previous thread during a context-switch.

Now if we want our target thread to have access to the stack of a given thread, we need to set the memory entries of the thread-stack we want to share in the 'context' of the target thread. There are a couple of important things to consider here - 

  1. The thread-stack that will be shared may have a memory access permission different from its intrinsic permission, i.e. if the thread-stack has R/W permission in its own 'context', its access permission while shared may still be read-only.
  2. We need to keep track of all the memory regions associated with a thread, along with their access permissions. This is important because we need to set/unset all these memory regions during each context switch/restoration (set with the proper access permission).
Determining a POSIX compliant way of sharing thread-stacks - Since sharing stacks is, at its core, a mapping operation, the obvious call for sharing stacks is mmap(). The problem is that mmap() usually maps a file into the address space of the currently executing process, whereas in our case we need to map a memory region to a thread of our choice. To do this, we need to combine mmap() with other calls. This can be achieved by the following sequence of calls -
  1. Get the file descriptor of the memory to be shared by opening a shared memory object through shm_open. Here we provide the access permission of the memory region to be shared. We also provide a fixed pattern of naming to the object (More on this in the next section).
  2. Make a call to ftruncate() to set the size of the shared memory object to the size of the stack, so that the shared memory object handler points to the stack address.
  3. Now we share this object with the target thread by making a call to mmap(). Here it is important to understand the parameters we need to pass to mmap() for a successful operation. This call is usually declared as mmap( void* addr, size_t length, int prot, int flags, int fd, off_t offset ). For our operation we need to do the following - 
      - addr - We pass the address of the target thread stack to indicate the thread with which we want to share the memory region, i.e. if we want to share the stack space of thread T2 with thread T1, we pass the stack address of T1.
      - length - This is the stack size of the sharing thread.
      - prot - This is the memory access attribute of the region. We have four options - 
          PROT_EXEC Pages may be executed.
          PROT_READ Pages may be read.
          PROT_WRITE Pages may be written.
          PROT_NONE Pages may not be accessed.
      - flags - For the stack sharing operation we must provide the MAP_SHARED option.
      - fd - This is the file descriptor of the shared memory object we discussed above.
      - offset - Since we want to share the complete stack space, we keep the offset at zero.
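
The three calls above can be sketched as one small C function. This is only an illustration that runs on an ordinary POSIX host, not the RTEMS implementation: the helper name share_stack_sketch() and its parameters are made up for this example, and on a host the addr argument is merely a hint, whereas the post uses it to identify the target thread stack.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of RTEMS): runs the three calls described
 * above and returns the mapped address, or MAP_FAILED on error.
 *
 *   name        - name of the shared memory object
 *   target_addr - value passed as the mmap() 'addr' argument
 *   size        - size of the stack being shared
 *   prot        - PROT_READ, PROT_WRITE, ... for the shared view
 */
void *share_stack_sketch(const char *name, void *target_addr,
                         size_t size, int prot)
{
    assert(name != NULL && size > 0);

    /* Step 1: open (or create) the shared memory object. */
    int fd = shm_open(name, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd < 0) {
        return MAP_FAILED;
    }

    /* Step 2: size the object to the size of the stack. */
    if (ftruncate(fd, (off_t) size) != 0) {
        close(fd);
        return MAP_FAILED;
    }

    /* Step 3: map it with MAP_SHARED and offset 0, as described above. */
    void *mapped = mmap(target_addr, size, prot, MAP_SHARED, fd, 0);

    /* On a POSIX host the mapping stays valid after the descriptor is closed. */
    close(fd);
    return mapped;
}
```

A caller would pass the protection it wants the target thread to have, for example PROT_READ for a read-only view of the shared stack, and check the result against MAP_FAILED.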

Application requirements for sharing thread stacks-

  The following are some of the requirements that an application writer has to follow for sharing thread stacks - 

  1. Naming of shared memory objects is done in the application, and the name follows a fixed pattern ( "/taskfs/" ). This is used to differentiate between a normal mmap operation and a stack sharing operation.
  2. We need to explicitly allocate stack memory from the application for stack sharing, and then set it through pthread_attr_setstack*().
  3. This one is a possible improvement that has not been integrated yet - Any application has to repeat the same series of steps (shm_open, ftruncate, mmap) to share a particular thread-stack. Maybe this can be wrapped in a function (rtems_share_stack()?), so that we only call that function every time we have to share a thread stack.
  4. For an example of how this is done refer to this test application.
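
To make the naming requirement concrete, here is a minimal sketch of how a name could be built and recognised. Only the "/taskfs/" prefix comes from the text above; the helper names and the choice of suffix are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Prefix taken from the naming pattern described above. */
#define STACK_SHARE_PREFIX "/taskfs/"

/* Hypothetical helper: true if 'name' marks a stack-sharing object. */
bool is_stack_share_name(const char *name)
{
    assert(name != NULL);
    return strncmp(name, STACK_SHARE_PREFIX, strlen(STACK_SHARE_PREFIX)) == 0;
}

/*
 * Hypothetical helper: writes "/taskfs/<suffix>" into 'buf'.
 * Returns false if the result does not fit into 'len' bytes.
 */
bool make_stack_share_name(char *buf, size_t len, const char *suffix)
{
    assert(buf != NULL && suffix != NULL);
    int n = snprintf(buf, len, "%s%s", STACK_SHARE_PREFIX, suffix);
    return n > 0 && (size_t) n < len;
}
```

With a check like this, any other name can be treated as a normal shared memory object, while names carrying the prefix take the stack-sharing path.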

Wednesday, 19 August 2020

Thread-stack isolation-v2

In the previous post, we discussed a primitive mechanism for isolating thread-stacks. That mechanism has some inherent flaws, which this post discusses along with their solutions.

Broadly there are two flaws with the previous implementation - 

Memory entries being set for 1 MB sections - The ARMv7 MMU implementation for changing the memory entries is defined for 1 MB sections, and this causes issues. Suppose we have two thread stacks, T1 and T2. If the application writer does not explicitly state the stack size, RTEMS allocates 8K bytes to a stack. Now, on switching from T1 to T2, we unset the memory entries of T1 and set those of T2. The problem is that we are actually unsetting the memory attributes of the entire 1 MB section that contains T1, which may also hold global R/W data used by T2. This will cause unnecessary fatal exceptions whenever we try to access global data from T2.

Solution - The solution to the problem is pretty simple, but as we will see, the implementation poses some subtle problems. We should set/unset the memory entries for only those regions that contain the thread stack, i.e. if we have stacks of size 8K then we should set/unset the memory entries of these regions only. This requires finer-grained control, so we need multilevel page tables (2 levels for 4K pages). RTEMS, in fact, provides support for 2-level page tables. 
The problem lies in the fact that for the Xilinx-zynq BSP, the translation table base is set at 0x100000 by the linker script and extends up to 0x104000 for section-based pages (16K in size). For small pages, however, it will extend up to 0x504000 (0x404000 bytes, a little over 4 MB). This will possibly conflict with other data regions (.text, .bss, etc.) that are placed in this address space, and setting up the translation table for small pages will fail. This is a BSP-specific problem and depends on how the linker script sets up the address space of a particular BSP. We will thus have to change the linker script to place the translation table entries in an address space where they do not conflict with other memory regions. We can actually take help from (switch to?) the realview_pbx_a9 BSP, which already supports 4K pages, to modify the linker script according to our needs. Here is a snippet - 


Tailoring our linker script according to the above snippet solves our problem and now we can set up translation tables for 4K pages. Now we can set/unset memory entries for our thread stacks without worrying about other memory regions, or maybe not 😏?
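
The table addresses quoted earlier (0x104000 for sections and 0x504000 for small pages) can be reproduced with a little arithmetic. The sketch below assumes the ARMv7-A short-descriptor format, with 4-byte descriptors, 4096 first-level entries and 256 entries per second-level table, and it assumes that every 1 MB section gets its own second-level table. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ADDRESS_SPACE_MB     4096u  /* 4 GB address space in 1 MB sections */
#define DESCRIPTOR_SIZE      4u     /* bytes per descriptor */
#define L2_ENTRIES_PER_TABLE 256u   /* 1 MB divided into 4 KB small pages */

/* Size of the first-level table: one descriptor per 1 MB section. */
uint32_t first_level_table_size(void)
{
    return ADDRESS_SPACE_MB * DESCRIPTOR_SIZE;
}

/* Size of one second-level table that covers 1 MB with 4 KB pages. */
uint32_t second_level_table_size(void)
{
    return L2_ENTRIES_PER_TABLE * DESCRIPTOR_SIZE;
}

/* End address of the tables if every section has a second-level table. */
uint32_t small_page_tables_end(uint32_t table_base)
{
    uint32_t total = first_level_table_size()
                   + ADDRESS_SPACE_MB * second_level_table_size();
    assert(table_base <= UINT32_MAX - total);
    return table_base + total;
}
```

With a base of 0x100000, the first-level table alone is 0x4000 bytes (16K), and adding 4096 second-level tables of 1K each gives 0x404000 bytes in total, which is why the small-page setup needs far more room than the section-only setup.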

Allocated stacks are not page-aligned - As discussed in the previous post, we use a custom stack allocator, defined in the application, to allocate thread stacks from the workspace and set the memory entries of the stack. The stacks allocated from the workspace are not page-aligned, where we consider 4K pages. In practice, this means that the stack address is, for example, 0xfbf9b70 instead of 0xfbf9000. How is this a problem for us?

When we set the memory entries for 4K pages, the entries are set per page, i.e. we have entry E1 for 0xfbf9000-..a000 and entry E2 for 0xfbfa000-..b000. Now when we get stack addresses from the workspace, it is possible that stack S1 ranges from 0xfbf9b70 to 0xfbfbb70 (8K size) and S2 ranges from 0xfbf7b60 to 0xfbf9b60. So when we unset the memory entries of S2 (which ends at 0xfbf9b60) during a context switch and set the entries of S1 (which begins at 0xfbf9b70), we end up setting the memory entry for the entire 0xfbf9000-..a000 page (as entries are set per page). This leaves the part of stack S2 that lies in this page still mapped in, and we do not achieve perfect stack isolation.

Solution - Since the memory entries are set per page, if we allocate page-aligned stacks we will be able to set/unset the memory entries of exactly the required region. In RTEMS we can allocate memory with a given byte alignment using Heap_Allocate_aligned_with_boundary(). We set the alignment to 4096 as we want a 4K-aligned address. Note that this allocation is done in the custom stack allocator.
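
The overlap described above is easy to check with plain address arithmetic. The following sketch uses hypothetical helper names and does not call any RTEMS API; it only shows why the unaligned example stacks end up on a common 4K page while page-aligned stacks do not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_PAGE_SIZE 4096u

/* Start address of the 4 KB page that contains 'addr'. */
uintptr_t page_base(uintptr_t addr)
{
    return addr & ~((uintptr_t) STACK_PAGE_SIZE - 1u);
}

/* Rounds 'addr' up to the next 4 KB boundary (unchanged if aligned). */
uintptr_t page_align_up(uintptr_t addr)
{
    return page_base(addr + STACK_PAGE_SIZE - 1u);
}

/*
 * True if the stacks [a_begin, a_end) and [b_begin, b_end) touch at least
 * one common page, in which case per-page permissions cannot separate them.
 */
bool stacks_share_a_page(uintptr_t a_begin, uintptr_t a_end,
                         uintptr_t b_begin, uintptr_t b_end)
{
    assert(a_begin < a_end && b_begin < b_end);

    /* First and last page covered by each stack. */
    uintptr_t a_first = page_base(a_begin);
    uintptr_t a_last = page_base(a_end - 1u);
    uintptr_t b_first = page_base(b_begin);
    uintptr_t b_last = page_base(b_end - 1u);

    return a_first <= b_last && b_first <= a_last;
}
```

For the example addresses, S1 (0xfbf9b70 to 0xfbfbb70) and S2 (0xfbf7b60 to 0xfbf9b60) both touch the page at 0xfbf9000, whereas two stacks placed at 0xfbf8000 and 0xfbfa000 with a size of 8K each share no page.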

Sunday, 28 June 2020

Thread-stack isolation on RTEMS

After isolating memory blocks on RTEMS with ARMv7-A MMU, it is time we isolate thread stacks.
But, before we do that, there are some concepts about threads on RTEMS that we should understand.

  • Stack Allocation - In RTEMS, the stack allocation mechanism is user-configurable. This means that the user can have a custom mechanism for allocating stacks for their BSP. This is done by defining CONFIGURE_TASK_ALLOCATOR_INIT with the allocation function in the application code.
       
  • Context Initialization - Each thread has its own set of registers (stack pointer, program counter) and relevant attributes (thread id) for its execution context, which need to be assigned to it during thread initialization. In RTEMS, context initialization is done through the CPU-specific _CPU_Context_Initialize() function. The Context_Control structure stores the thread registers and related attributes; it is initialized by a call to _CPU_Context_Initialize().
Context_Control structure


  • Context switch and restoration - In any context switch procedure we need to save the register state of the executing thread, including its stack pointer and program counter, and then switch to the heir thread by loading the program counter with the address of the heir thread stack pointer. In RTEMS this is done through CPU-specific assembly code. During a context switch, we save the registers of the executing thread, load the register values of the heir thread, and switch to _Thread_Do_Dispatch() by loading the program counter with the pointer to the handler function. For restoration, we simply restore the register details by loading them from the 'R0' register and similarly branch to the _Thread_Handler() function.

                                     
Context-Switching Code

Isolating Thread Stacks - 

  • Setting memory attributes dynamically - We have already isolated memory blocks in the previous post by setting different access permissions for different memory regions. Now we need to provide a high-level mechanism to isolate these blocks so that the same framework can be used across all architectures. cpukit/include/rtems/score/memorymanagement.h provides the set of APIs that can be defined for the target architecture for setting the memory attributes of an address space. Here, we will see its implementation for the ARMv7 MMU.
    • Flag Translation - At the high level we need to pass generic memory attributes to the memory_attributes_set() function. These attributes are translated to the architecture-specific implementation for the BSP architecture, i.e. if we pass the 'READ/WRITE' flag to the function, at the architecture level it needs to translate to the bit 'combination and position' of the access-permission bits in the control register. This is why we map the high-level flags to the low-level implementation by defining memory_translate_flags() for the target architecture. You can check the implementation for the ARM MMU here.
  • Allocating protected stacks - As discussed above, RTEMS allows a custom stack allocation mechanism. This is useful to us, as we can use this feature to allocate stacks with specified memory permissions. Another use of this feature, as we will see further ahead in the discussion, is that we can register the allocated stack in a chain (doubly linked list) to track it effectively. We define our custom stack allocation mechanism in bsp/shared/start/stackalloc.c. Once this is defined, the user should always configure the application as discussed earlier to allocate protected stacks.
  • Tracking protected stacks - We must keep track of all the allocated stacks and their memory access attributes because, during a context switch, we need to unset the memory attributes of the current stack and set the memory attributes of the heir stack. This can be done only when we track the stack attributes of both the currently executing thread and the heir thread.
    • Protected stack attributes - Every allocated stack has some attributes that we need to track for setting the memory attributes of the stack space. These include the stack size, stack address, access flags, execution status, and, in the case of stack sharing, shared stack attributes. The stack-management APIs are declared in cpukit/include/rtems/score/stackmanagement.h. The stack management structure looks like this - 
    • Adding allocated stacks to a chain - A very simple way of tracking stacks is to add each allocated stack to a linked list. In RTEMS, we already have chains that are implemented as a doubly linked list. We can set the current_stack attribute to 'true' for the most recently allocated stack, set it to 'false' for all the other nodes, and append the stack attribute structure to the list. This way we can keep track of all the allocated stacks and their allocation status. The implementation of adding stack attributes to a chain can be found here.
  • Context initialization of protected stacks - We must register/initialize the stack attributes of a particular thread in its Context_Control structure because the members of this structure are saved and restored during a context switch. We call prot_stack_context_initialize() from _CPU_Context_Initialize() to register the stack attributes in the control structure.

  • Context switching of protected stacks - For switching the context of protected stacks, we follow the pre-existing model in RTEMS. We save the relevant registers and attributes and call the _Thread_Do_Dispatch() function by loading the program counter with the address of the function. The only difference is that, for protected stacks, we call the prot_stack_context_switch function, which unsets the current memory attributes, from the assembly code, passing the stack attribute structure as a parameter. We load this parameter into the 'R0' register through an 'LDR' instruction by specifying the proper offset into the context control structure.

  • Context restoration - We follow the same approach for context restoration as for switching. We restore the relevant registers and attributes and call the _Thread_Handler() function by loading the program counter with the address of the function. The difference here is that we call prot_stack_context_restore, which sets the memory attributes of the thread stack and marks the current_stack attribute as 'true'. We pass the stack attribute structure to the function by loading it into the 'R0' register through an 'LDR' instruction by specifying the proper offset into the context control structure.
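
The bookkeeping behind the switch and restore steps can be modelled on a host without touching an MMU. The sketch below is a simplified stand-in and not the code from the repository: the structure fields follow the attributes listed above, but every type and function name is made up, a plain singly linked list replaces the RTEMS chain, and "setting" or "unsetting" memory attributes is reduced to updating a field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Generic access flags; the names are illustrative only. */
typedef enum { ACCESS_NONE, ACCESS_READ_ONLY, ACCESS_READ_WRITE } access_t;

/* Simplified stand-in for the stack attribute structure. */
typedef struct stack_attr {
    uintptr_t base;           /* stack address */
    size_t size;              /* stack size */
    access_t configured;      /* access granted while the thread runs */
    access_t active;          /* access that is currently in effect */
    bool current_stack;       /* true only for the executing thread */
    struct stack_attr *next;  /* simple list instead of an RTEMS chain */
} stack_attr;

/* On the way out: the executing stack becomes inaccessible. */
void sketch_context_switch(stack_attr *executing)
{
    assert(executing != NULL);
    executing->active = ACCESS_NONE;
    executing->current_stack = false;
}

/* On the way in: the heir stack gets its configured access back. */
void sketch_context_restore(stack_attr *heir)
{
    assert(heir != NULL);
    heir->active = heir->configured;
    heir->current_stack = true;
}

/* Walks the list and counts the stacks that are currently accessible. */
size_t accessible_stacks(const stack_attr *head)
{
    size_t count = 0;
    for (const stack_attr *s = head; s != NULL; s = s->next) {
        if (s->active != ACCESS_NONE) {
            ++count;
        }
    }
    return count;
}
```

After a switch followed by a restore, exactly one stack in the list should be accessible, which is the invariant that strict isolation relies on.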


This completes our thread isolation implementation. Clone and build this repo for trying out the implementation with various cases where you try to access the stack address of a dormant thread from an executing thread. Be ready to have the OS throw exceptions your way!


Note - This implementation has only been tested for POSIX threads. Classical RTEMS threads have not yet been tested, and the implementation may leak memory when trying to isolate them.

Saturday, 27 June 2020

Isolating two blocks of memory and living to tell the tale!

In the last post, we discussed some of the high-level ideas of thread-stack protection and details of its implementation using an MMU. This time, we will be isolating blocks of memory using the MMU on RTEMS. But first things first - 

What do we mean by isolating two blocks of memory?

Suppose we have two blocks of memory, A and B. We want to allow both read and write operations on A and read-only operations on B. If we can implement a system using an MMU where we can change the values in block A, but get an exception as soon as we try to write to block B, we have isolated the two blocks of memory: we have assigned different access permissions to these blocks, and an operation that works on block A will raise an exception for block B.

What do we plan to achieve by isolating memory blocks?

Isolating memory blocks will be one of the fundamental stepping stones on the way to thread-stack isolation. If we think about it, thread-stack protection in its most watered-down form is isolating memory blocks (the memory blocks, in this case, being the stack address space) from each other.
Isolating these memory blocks will give us a framework upon which we can implement the other complexities of thread-stack isolation (stack allocation, context switching, etc.).


Now, with that out of the way, we should focus on the details of the implementation and start getting our hands dirty!

  • Choice of processor architecture - 
We will implement our memory isolation on an ARM-based MMU, in particular the ARMv7-A. There are two reasons for this choice. First, some well-known boards (BeagleBone, Raspberry Pi, Xilinx Zynq) are based on this processor. More importantly, RTEMS already has support for initialization, page table, and page entry setup for the ARMv7-A MMU. This means we don't have to code everything from scratch and can simply use the existing support to isolate the memory blocks.
  • Paging levels - 
At this point, we need to understand the concept of levels of translation that a virtual memory address goes through so that it represents an address in the physical memory.
By setting up page table entries appropriately, we can have memory regions of a 'particular type' that range in size from 16 MB to 4 KB. So why do we need memory regions of such varying sizes?
There are some cases where we need a large chunk of memory (e.g. the heap region) to have the same memory access permissions, and at other times we need fine control over small memory regions (e.g. thread stacks).
The sections and supersections need only one level of paging, whereas the small and large page addresses are translated using two levels of paging. 



  • ARMv7-A MMU configuration for memory isolation-
Let us first understand what all configurations we need to set up for an MMU to isolate memory blocks - 
    • Initialising the MMU( Duh! )
    • Setting up the page table entries for accessing physical memory
    • Assigning proper access permissions in the page table entries for memory operations

A key player in controlling the MMU is the CP15 register set. Coprocessor 15 controls cache configuration and management, system performance, and, more importantly, the memory management unit. All the registers that control MMU configuration belong to the CP15 register set.
In the previous post, we had gained a general idea of accessing memory and setting up their access permission using access flags in an MMU. Now let us look at the ARMv7-A specific details of performing those operations.
    • Page table base setup - The page table base address in the v7-A MMU is stored in either the TTBR0 or the TTBR1 register (TTBR = Translation Table Base Register). The choice is made by setting bits[2:0] of the TTBCR (TTB Control Register). As you may have observed, we can set the bits in various permutations to select the registers in different manners, but that is a topic for another day😉. Right now, we will set the bits to 0, which means TTBR0 has the address of the page table base.
    • Page table entries setup - To fill in the entries of a page table, we first have to determine the page size for the memory region. In the v7-A architecture, depending upon the size of the region that is addressed by a single page table entry, we have supersections (16 MB regions), sections (1 MB), large pages (64 KB), and small pages (4 KB). We will use 1 MB sections, as they are easier to implement than small/large pages but do not take up as much of the address space as supersections. The page table descriptor or entry format for a v7-A MMU is -

 

      • To isolate memory blocks, we need to focus only on the 'Section base address' bits, the AP (access permission) bits, and bits[1:0]. As we discussed above, we need only a single level of translation for sections.
      • We set bits[1:0] to 10 for section translation. The AP bits are set to 01 for region A (read/write permission) and 11 for region B (read-only permission).
      • Bits[31:20] of the page table entry provide the section base address, and the offset is provided by the lower 20 bits of the VA. The translation flow for a section looks something like this - 
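
As a rough illustration of these bit fields, the sketch below builds a first-level section descriptor and applies the section translation to a virtual address. The AP[1:0] values are the ones quoted above. Placing AP[1:0] in bits[11:10] and using bit 15 as AP[2] for the read-only case follows the ARMv7-A short-descriptor format and is an addition of this example, not something stated in the post. All other descriptor fields are left at zero, and the macro and function names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_BASE_MASK   0xFFF00000u  /* bits[31:20] */
#define SECTION_OFFSET_MASK 0x000FFFFFu  /* bits[19:0] */
#define DESC_TYPE_SECTION   0x2u         /* bits[1:0] = 10 */
#define DESC_AP_SHIFT       10u          /* AP[1:0] sit in bits[11:10] */
#define DESC_AP2            (1u << 15)   /* AP[2], assumed set for read-only */

/* AP[1:0] values quoted in the post. */
#define AP_REGION_A 0x1u  /* read/write */
#define AP_REGION_B 0x3u  /* read-only */

/* Builds a section descriptor for a 1 MB aligned base address. */
uint32_t make_section_descriptor(uint32_t base, uint32_t ap10, int read_only)
{
    assert((base & SECTION_OFFSET_MASK) == 0u);
    assert(ap10 <= 0x3u);

    uint32_t desc = (base & SECTION_BASE_MASK)
                  | (ap10 << DESC_AP_SHIFT)
                  | DESC_TYPE_SECTION;
    if (read_only) {
        desc |= DESC_AP2;
    }
    return desc;
}

/* Index into the first-level table: the top 12 bits of the VA. */
uint32_t section_index(uint32_t va)
{
    return va >> 20;
}

/* Physical address: section base from the descriptor plus the VA offset. */
uint32_t translate_section(uint32_t desc, uint32_t va)
{
    assert((desc & 0x3u) == DESC_TYPE_SECTION);
    return (desc & SECTION_BASE_MASK) | (va & SECTION_OFFSET_MASK);
}
```

Because the section base in each descriptor equals the virtual section address, the translation returns the same address that went in, which matches the 1:1 VA-PA mapping used throughout these posts.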
     


  • Memory isolation on RTEMS -                         
We need to understand some important ARM v7-A MMU implementation related concepts in RTEMS to isolate memory blocks - 

  • MMU initialization - The MMU initialization code is almost the same for all ARMv7-A supporting BSPs in RTEMS. bsps/arm/include/bsp/arm-cp15-start.h has arm_cp15_start_setup_mmu_and_cache() and arm_cp15_start_setup_translation_table_and_enable_mmu_and_cache(), which start the MMU and set up the page tables for the specified memory regions.
  • Specifying memory regions - Each BSP has its own specified address space where various sections (.bss, .text, etc.) should be placed. These regions also need their own access permissions, which are specified by the mmu_config_table[]. In our case, we will be using the zynq BSP.

 The ARMV7_CP15_START_DEFAULT_SECTIONS has the address space details of the default sections like .bss, .text, and .rodata.

This table is passed into the arm_cp15_start_setup_translation_table_and_enable_mmu_and_cache() which sets up the translation table entry for the specified memory regions.
  • Linker defines - Now we have defined the memory attributes of various address spaces, but how do we actually place the default sections into the defined memory regions?  This is done with the help of the linker defines and the linker scripts. The linker define changes the defined memory regions into linker symbols
       
  The linker script places these symbols into the specified memory regions.



Now we just have to allocate two memory regions, place them in the mmu_config_table with the specified permission and pass them into  arm_cp15_start_setup_translation_table_and_enable_mmu_and_cache(). When we try to write to memory region B with the above configuration, we get this - 

We first set the permission for region A as read/write and then read-only for region B. We successfully complete a write operation on region A, but as soon as we try to write to region B, the OS throws an exception.
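
The behaviour can be mimicked on a host with a small table lookup. This is only a model of the idea and not the RTEMS configuration table: the entry layout, the addresses of regions A and B, and all names are invented for the example, and the data abort is represented by a false return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PERM_NO_ACCESS, PERM_READ_ONLY, PERM_READ_WRITE } perm_t;

/* Hypothetical table entry: a region [begin, end) and its permission. */
typedef struct {
    uint32_t begin;
    uint32_t end;
    perm_t perm;
} region_config;

/* Two 1 MB regions standing in for blocks A and B; addresses are made up. */
static const region_config demo_table[] = {
    { 0x00200000u, 0x00300000u, PERM_READ_WRITE }, /* region A */
    { 0x00300000u, 0x00400000u, PERM_READ_ONLY },  /* region B */
};

/* True if a write to 'addr' is allowed; false models the exception. */
bool write_allowed(const region_config *table, size_t count, uint32_t addr)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (addr >= table[i].begin && addr < table[i].end) {
            return table[i].perm == PERM_READ_WRITE;
        }
    }
    return false; /* addresses outside the table are treated as faults */
}
```

A write into region A passes the check, while a write into region B or to an address outside the table is rejected, mirroring the exception seen on the target.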

Sunday, 17 May 2020

A prerequisite for the stuff to come

In this series of blog posts, we will try to implement thread-stack protection for RTEMS. But first,

What is thread-stack protection?

Well, the name says it all 😉. A thread, in any OS, is the basic unit of CPU utilization. It has a program counter, a register set, and a stack, which can be thought of as an address space that is private to that thread. It is this private address space that we want to protect from accidental or malicious reads and writes. Which leads us to another question - 

How do we protect thread stacks?


This is a two-part answer: there is the concept, the abstract idea of thread-stack protection, and there is the actual implementation - 
  • It is important to understand how a thread's stack can be corrupted. There are multiple ways to corrupt a thread's stack (this is an interesting and informative read on one of the cases). Our focus will be on protecting against the case where a second thread somehow gains access to the address space of our target thread. The idea is to have a setting where, for the executing thread, the address space of every other thread either becomes 'invisible' or, if the thread tries to access it, the OS throws an error. Now how do we implement such a mechanism?
  • Enter the MMU (Memory Management Unit), which, as the name suggests (I know I am getting repetitive 😛), is a piece of hardware present on most modern processors that can take care of some of the more complex ways of accessing memory for us. This brings us very nicely to our next question -
How do we implement thread stack protection using MMU?


We will take ARMv7-A processors as the example for this demonstration. They are found on some famous boards (Raspberry Pi, BeagleBone, etc.), and on top of that RTEMS already has very mature MMU support for them.

For this blog post, I will omit some of the complex details (they will be covered in the next post).
Right now we have to understand some important concepts related to memory operations:-
       
  1. Memory access permissions:- One of the best features of an MMU is that we can decide what kind of operations are allowed on a given memory region. For example, we can set a 2 KiB memory region to permit only reads, the next 2 KiB to permit reads and writes, and the region after that to permit no access at all. We just have to set a combination of two bits for the memory region of our choice, at a location the MMU already specifies, to select the type of operation that can be done on that region. Attempt anything other than that and you will find your OS throwing exceptions at you, and probably for a good cause!! But another question arises: it's all well and good that we can specify access permissions for memory regions, but how do we select these memory regions!?
  2. Virtual Address:- Virtual addresses are the memory addresses that the CPU generates; the MMU then translates them to the actual physical addresses of the memory on our system. For this project, we will use 1:1 VA-PA translation, i.e. we will set up our address translation scheme in such a way that the physical address and the virtual address are the same. Wait, we can control address translation?
  3. Pages and Page Table:- Suppose we have the case we took in point 1, with small pieces of memory requiring different types of access permissions. If we were to set the memory access bits for read-only operation, this would apply to the complete address space and we wouldn't have any read-write memory region left! So what do we do?
     A simple solution is to change the memory permission for the entire address space at each context switch. But this leads to a lot of problems: for starters, we would be giving up a lot of memory to read-only operation for the sake of 2 KiB worth of memory, and then there is the problem of the .text, .bss and other memory regions that need separate access permissions in every context.
     A more efficient solution is to divide the address space into regions that match our requirements and set the memory access permission for each region independently of the others. This way we don't waste memory and can have separate regions with different access permissions for each context. These regions are more commonly known as pages. But the question still remains: how does the MMU address these pages, and how do we set the memory permission for a particular page? This would require a long explanation, and a simple image can do it all pretty concisely -

Page Table organization and address translation
                                                                                          



It will be simpler if we don't focus on the address bits and think about the concept involved.
Basically, each page table contains entries, known as page table entries, that hold either a part of the physical address or the address of the next page table (in both cases it is a base address), together with the access permission bits. A part of the virtual address is used as an offset, which is combined with the base address from the page table entry to give the physical address. For example, suppose we have 0x223 as the page table entry. The last two bits are 0b11 and are used for access permission, so we are left with 0x220. Now we combine it with a part of the virtual address, let us say 0x15, placing the base in the upper bits and the offset in the lower bits, which makes the value 0x22015, and voila! We have the actual physical address.
In this manner, by setting up page tables and page table entries, we can set memory permissions for selected regions of memory. Now, this is a very crude example with a lot of details missing, and not everything may be clear yet. It will be in the next post, when we go into the gory details of the implementation and try to isolate two blocks of memory.
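To make the crude 0x223 example concrete, here is a minimal sketch of it in C. The layout is an assumption made purely to reproduce the numbers above (two permission bits at the bottom of the entry, an 8-bit offset taken from the virtual address). It is not the real ARMv7-A descriptor format, and `translate`, `PERM_MASK` and `OFFSET_BITS` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed toy layout, chosen only to match the example in the text. */
#define PERM_MASK   0x3u /* low two bits of the entry hold the permission */
#define OFFSET_BITS 8u   /* width of the offset taken from the virtual address */

typedef struct {
  uint32_t physical; /* resulting physical address */
  uint32_t perm;     /* permission bits read from the entry */
} translation;

/* Combine a page table entry with an offset from the virtual address. */
static translation translate(uint32_t entry, uint32_t va_offset)
{
  translation t;
  uint32_t    base = entry & ~PERM_MASK; /* 0x223 -> 0x220 */

  assert(va_offset < (1u << OFFSET_BITS));

  t.perm     = entry & PERM_MASK;                 /* 0x223 -> 0b11 */
  t.physical = (base << OFFSET_BITS) | va_offset; /* 0x220, 0x15 -> 0x22015 */

  return t;
}
```

Calling `translate(0x223, 0x15)` yields the physical address 0x22015 and the permission bits 0b11, the same values as in the worked example.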

For now,  try to piece everything together and wait for the next post!!