Tuesday, 25 August 2020

GSoC - Final report

 Phase 1 -

  • My initial focus for the project was to completely isolate two thread-stacks. As a prerequisite, this required using the ARMv7 MMU to isolate blocks of memory dynamically. During the first two weeks, I worked on providing support for dynamic memory isolation. Code sample.
  • The next part of the task involved providing proper context switch support for setting/unsetting memory attributes of the thread-stacks. This completed my first phase of GSoC.
  • These 4 blog posts provide an explanation of my work in phase 1.
  • A complete view of my work in phase 1 can be found here.
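
To make the set/unset idea from the bullets above more concrete, here is a minimal host-side sketch in C. It is not the RTEMS code and it does not touch a real MMU: the page table is just an array, and every name in it (page_access, thread_stack, set_region, switch_stacks, can_access, MODEL_PAGE_SIZE) is hypothetical and chosen only for illustration. It assumes page-aligned stacks inside a small modelled address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size of the model */
#define MODEL_NUM_PAGES 64u   /* size of the modelled address space */

/* Hypothetical access states for one page of the modelled address space. */
typedef enum { ACCESS_NONE = 0, ACCESS_READ_WRITE = 1 } page_access;

/* Hypothetical per-thread record of where the thread's stack lives. */
typedef struct {
  uintptr_t stack_begin; /* page-aligned start address */
  size_t    stack_size;  /* multiple of MODEL_PAGE_SIZE */
} thread_stack;

/* Stand-in for the MMU translation table: one entry per page. */
static page_access page_table[MODEL_NUM_PAGES];

/* Set the access attribute of every page in [begin, begin + size). */
void set_region(uintptr_t begin, size_t size, page_access access)
{
  assert(begin % MODEL_PAGE_SIZE == 0);
  assert((begin + size) / MODEL_PAGE_SIZE <= MODEL_NUM_PAGES);

  for (uintptr_t addr = begin; addr < begin + size; addr += MODEL_PAGE_SIZE) {
    page_table[addr / MODEL_PAGE_SIZE] = access;
  }
}

/* Model of the context switch hook: revoke access to the stack of the
 * thread being switched out, then grant access to the heir's stack. */
void switch_stacks(const thread_stack *executing, const thread_stack *heir)
{
  set_region(executing->stack_begin, executing->stack_size, ACCESS_NONE);
  set_region(heir->stack_begin, heir->stack_size, ACCESS_READ_WRITE);
}

/* Returns non-zero if the modelled MMU would allow an access to addr. */
int can_access(uintptr_t addr)
{
  return page_table[addr / MODEL_PAGE_SIZE] == ACCESS_READ_WRITE;
}
```

In this model, after switch_stacks(&a, &b) a call to can_access() on any address inside a's stack returns 0, which corresponds to the fault that a real MMU would raise when another thread touches a protected stack.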

Reflection on Phase 1 - Phase 1 started with a fair bit of skepticism on my side, as I had never done anything remotely close to memory isolation. This was compounded by the fact that I was unfamiliar with the RTEMS codebase. Nevertheless, my mentors gave me some valuable suggestions for making incremental progress. Once I had written some code that successfully provided basic memory isolation, I felt more confident about the direction my project was heading in. Another major challenge in this phase was writing the context switching code for protected stacks in assembly, as I did not have a lot of experience with ARM assembly. Here, the good old GNU debugger came to my help, as I checked the changes in all the register values after every instruction. This was a slow and tedious process: it took a week to get the context switch code right, even though the volume of code was not very large.


Phase 2 - 

  • The focus in phase 2 was to get a working, optimized end-to-end solution for strict thread-stack isolation and to provide basic support for thread-stack sharing.
  • A simple test, in which a thread tries to read from the stack of another thread while that thread is switched out, was used to test this end-to-end solution.
  • This test laid bare some of the dormant problems with my implementation that had gone unnoticed. I discuss them and their solution in this post.
  • Up until this point, I had been using separate structures for tracking protected stacks. This mechanism also allocated the memory for these structures dynamically. I integrated these structures into the already present Stack Control structure so that they can be allocated statically, based on the number of threads that are configured through the application.
  • The next part involved providing a naming mechanism for the protected thread-stacks. This was done in order to identify the memory region to be allocated through shm_open().
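
As a rough illustration of why a stack needs a name, the sketch below builds a name from a thread identifier and then uses that name with the standard POSIX calls shm_open(), ftruncate() and mmap(). The naming scheme ("/stack_" plus the id in hex) and the helpers make_stack_name and map_named_stack are assumptions made up for this example; they are not the names or the exact flow used in my RTEMS patches.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define STACK_NAME_MAX 32

/* Hypothetical naming scheme: "/stack_" followed by the thread id in hex.
 * Returns 0 on success, -1 if the name does not fit into buf. */
int make_stack_name(uint32_t thread_id, char *buf, size_t len)
{
  int n = snprintf(buf, len, "/stack_%08" PRIx32, thread_id);

  return (n > 0 && (size_t) n < len) ? 0 : -1;
}

/* Open (or create) the named shared memory object, size it, and map it.
 * Returns MAP_FAILED on any error. */
void *map_named_stack(const char *name, size_t size)
{
  void *addr;
  int   fd = shm_open(name, O_CREAT | O_RDWR, 0600);

  if (fd < 0) {
    return MAP_FAILED;
  }

  if (ftruncate(fd, (off_t) size) != 0) {
    close(fd);
    return MAP_FAILED;
  }

  addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  close(fd); /* the mapping stays valid after the descriptor is closed */

  return addr;
}
```

A thread that wants access to another thread's stack would build the same name from that thread's identifier and pass it to map_named_stack(); the name is what lets both sides agree on which memory region is meant.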

Reflection on Phase 2 - For the first two weeks of this phase I was relatively unproductive, as I had to appear for my end-semester examinations. Once they were over, I quickly started making progress on my plans. This is when I encountered a major hurdle in my project: I was getting fatal exceptions left, right, and center in my context switch code, although theoretically my mechanism for isolating thread stacks seemed sound. I discussed this on the mailing list and got some useful advice. That did not solve the problem completely, so I had to rethink and go through all of my assumptions from scratch to figure out exactly what was going wrong. Finally, I realized that some of my assumptions were wrong in ways that, in hindsight, were glaringly obvious. A detailed explanation of what was going wrong and my solution can be found here. On a positive note, going through everything all over again solidified some of my fundamentals related to memory isolation and how an MMU works in general.


Phase 3
  • After completing the basic support for thread-stack sharing, my focus in this phase was to provide support for thread-stack sharing through high-level POSIX APIs. Another important goal for this phase was to optimize my code and make it merge-ready by using more of the structures that are already present in the RTEMS codebase.
  • I have discussed the rationale behind high-level design choices for stack-sharing here. The relevant code for stack-sharing can be found here.
  • The above changes had some flaws. They were not optimized enough, and the mechanism for stack-naming was not integrated properly into the stack control structure. These patches moved my code a bit closer to merging, but some of the issues with proper integration into the stack control structure still remained.
  • I also worked on providing a mechanism for the conditional configuration of thread-stack protection. This lets the user enable thread-stack protection conditionally at build time. This option is based on the new build system of RTEMS.


Reflection on Phase 3 - By the time phase 3 started, I had managed to come up with a working, stable solution for thread-stack isolation and sharing, or at least so I thought. When I tested my code against test cases in which thread dispatching was being done, my solution broke. After discussing it on the mailing list, I realized the mistake and was able to resolve the issue. This phase primarily consisted of refactoring my code to make it mergeable and smoothing out the rough edges where thread-stack isolation was failing.

Final work -  

Future Work - There are two important things that need to be taken care of before this work becomes merge-ready:
  • Currently, the mechanism for isolating thread-stacks requires setting/unsetting the memory attributes of the regions corresponding to the thread-stack. This can be optimized by changing the page-table base during each context switch.
  • Deletion of threads when their life cycle finishes also needs to be handled.

Steps to re-create the current work -    

  • Clone the Final_release branch from my repo.
  • Since the configuration option is based on the new build system, the standard 'make' option is not compatible with this feature. Refer to Sebastian Huber's rtems-docs repo, chapter 7, for the basic setup of the new build system.
  • The important point to keep in mind is to set the RTEMS_THREAD_STACK_PROTECTION option to True in my repo's config.ini file before the './waf configure' command.
  • Currently, thread-stack isolation and sharing work only for the arm/realview_pbx_a9 BSP.
  • Simulation has been done on QEMU.
  • For a demonstration of thread-stack isolation, refer to the thread_stack_protection test in the testsuite. For the thread-stack sharing mechanism, refer to the thread_stack_sharing test.
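
For reference, the config.ini entry described in the steps above would look roughly like the fragment below. The section name is an assumption derived from the BSP named above and follows the arch/bsp section convention of the new build system; check the documentation mentioned in the steps for the exact spelling on your checkout.

```ini
# Section name assumed from the BSP mentioned above.
[arm/realview_pbx_a9]
# Option introduced by this project; it must be set before configuring.
RTEMS_THREAD_STACK_PROTECTION = True
```

With this file in place, run './waf configure' and then build as described in the build system documentation.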
