A dedicated kernel for multi-threading applications.

Tuesday, March 12, 2019

Toro supports VirtIO block disks and Berkeley sockets

Hello folks! I am glad to announce that Toro supports VirtIO block devices (issue #158) and Berkeley sockets (issue #268). I worked on these issues over the last two months and they are already in master. To test these new features, I modified the StaticWebServer example. You can see that the ATA driver has been replaced by the VirtIO disk driver. Also, you can see that the non-blocking web service has been replaced by classical Berkeley sockets.

Matias 

Thursday, February 07, 2019

Roadmap for 2019

Hello folks! I would like to share the roadmap for this year. I am going to focus on improving Toro to run light microservices on VMs. To achieve this, I am going to work on:
1. Reduce the size of the generated image
2. Speed up the booting time
3. Support microVMs, e.g., Firecracker, NEMU.
Most microVM solutions avoid the use of emulated devices, so it is very important to add support for more VirtIO devices. Therefore, the next feature is 4) to develop the VirtIO block driver (see issue #158). In addition, Toro is going to support classic Berkeley sockets for microservices that perform intense I/O. Such a feature is currently being developed in issue #268.

Monday, December 31, 2018

Toro in 2018 was great!

Hello folks! The year is ending and I want to summarise what happened in Toro during 2018. Let me begin with the events. We had the opportunity to present Toro at FOSDEM'18 and Open Source Summit Europe 2018. Both conferences were very interesting and we got a ton of feedback. Regarding publications, I had the pleasure of writing an article for the Blaise Pascal Magazine. I hope to continue doing this in 2019. This year was particularly rich in new features. Here I list a few of them: 
- VirtIO network driver
- FAT driver to enable support for QEMU's vfat interface
- Support for the try..except statement
- Support for the -O2 flag when the kernel is compiled
- Optimisation of the CPU usage during the idle loop
- Optimisation of the booting time and the size of the generated image. This is going to be presented at FOSDEM'19.
But what will 2019 bring for Toro? Toro is perfect for microservices and during 2019 we will show that. Toro is going to support both blocking and non-blocking sockets: the former for microservices that do I/O and the latter for microservices that do not need to block to answer requests. Toro is also going to support more VirtIO drivers, e.g., block devices and serial devices. Following that work, I am investigating porting Toro to solutions that propose the use of microVMs, like Firecracker or NEMU. Roughly speaking, these solutions propose a reduced device model in which most devices are not emulated and only VirtIO devices are supported. They have several benefits, like a small footprint and fast booting, thus making them perfect to host microservices.

Have a nice 2019!

Matias. 

Tuesday, December 18, 2018

Toro will be at FOSDEM'19!

Hello folks! I have the pleasure of presenting Toro at FOSDEM'19. This will be my third presentation. This time I am going to talk about how Toro is optimized to speed up the booting time. This is particularly interesting in the context of microservices. When a VM is used to host a microservice, it is powered on/off on demand. This allows cloud providers to save resources like CPU and memory. However, it requires that the VM be up and running very quickly. In this talk, I discuss three approaches that aim to speed up the initialization of VMs: NEMU, Qboot, and Firecracker (see abstract here). During the talk, I try these solutions with Toro and discuss their benefits and drawbacks.

Sunday, August 26, 2018

Toro supports the try...except statement and user exceptions!

Hello folks! I just merged to master the commits that support the try..except statement. This allows user applications to handle exceptions. To do this, I had to switch to the Linux RTL, which involved a lot of changes. I updated the wiki in case you want to try it. On Windows, it is necessary to get a Free Pascal cross-compiler from Windows to Linux, which is well explained in the wiki. I hope you enjoy it!

Matias Vara   

Thursday, August 16, 2018

Toro will be present in OSSEU18!

Hello folks! I am very happy to announce that Toro will be in OSSEU'18! For further information check http://sched.co/FxYD. I hope to see you all there!

Cheers, Matias.

Sunday, June 24, 2018

Reducing CPU usage on Toro guests, "The numbers"

Hello folks! I experimented with the latest improvement of Toro regarding reducing energy consumption. I want to thank my close friend Cesar Bernardini for the experiments. In the tests, we compare an Ubuntu guest with a Toro guest on QEMU. We set up a 2-core machine with 256MB per core. To bench each kernel, we generate N HTTP requests, stop, and repeat this after a fixed interval. We then measure the CPU usage of the QEMU process by using top, and we get the following graphs:

Toro without any improvement:
In this graph, you can see that QEMU's process is at 100% all the time.

Toro with the improvements:

With the improvements, QEMU's process is at 100% only when traffic is received.

Ubuntu guest:

When Ubuntu is serving traffic, QEMU's process uses between ~40% and ~60%; then, when there is no traffic, CPU usage drops to around ~0%..15%.

In the next experiments, we increase the number of messages.

Toro guest:
When the number of messages is increased, the Toro guest footprint does not change.

Ubuntu guest:
In the case of an Ubuntu guest, the CPU usage of the QEMU process reaches 100% during traffic. This means that Ubuntu is correctly scaling CPU usage on demand.

Take Away Lessons:

- In production, the CPU usage of guests matters because the VCPUs are a shared resource
- The approach in Toro has cut the CPU usage in half; however, an overall power-management solution must also scale the CPU, i.e., put the processor into a P-state
- The approaches may depend on the hypervisor and its ability to emulate/virtualize the instructions related to power management, e.g., mwait/monitor

Thursday, May 24, 2018

Booting Toro in 123ms on QEMU-Lite

Hello folks! I have spent some time porting Toro to QEMU-Lite. This work is still very experimental and can be found in the branch feature-bootforqemu-lite. If you want to know more about QEMU-Lite, check this great presentation. Roughly speaking, QEMU-Lite is an improved version of QEMU dedicated to booting a Linux kernel guest. QEMU-Lite improves the booting time by removing unnecessary steps from the booting process. For example, it removes the BIOS and the need for a bootloader. When QEMU jumps to the kernel code, the microprocessor is already in 64-bit long mode with paging enabled. To make Toro work on QEMU-Lite, I had to remove the whole bootloader and replace it with a simpler one that supports the Multiboot standard. So far I am only able to boot the application ToroHello.pas, which takes only 123ms to boot. Future work will be to support multiprocessing, so stay tuned!

Cheers, Matias.

Friday, April 20, 2018

Easing the sharing of files between host and guest by using the Qemu Fat feature

Hello folks! I have just committed the first version of a FAT driver. This driver, together with the vfat interface of QEMU, eases the sharing of files between the guest and the host. The feature relies on QEMU's mechanism for presenting a directory of the host to the guest as a FAT partition. It is enabled by passing "-drive file=fat:rw:ToroFiles", where ToroFiles is the path of a directory on the host machine. By doing so, QEMU presents to the guest a new block device containing a FAT partition with the whole file structure of the ToroFiles directory. Depending on some flags, the partition can be either FAT32 or FAT16. From QEMU's source code, it seems FAT32 is not tested enough, so I decided to support FAT16 only. The main benefit of this mechanism is that it eases the sharing of files between the guest and the host. The main drawback is that you should not modify the directory while the guest is running because QEMU may get confused. To know more about this feature in QEMU, you can visit https://en.wikibooks.org/wiki/QEMU/Devices/Storage. The commit that adds this feature can be found here: https://github.com/MatiasVara/torokernel/commit/2de6631d10202f20db7cef61469ed9e795ed6954. For the moment, the driver allows only read operations. I expect to have write operations soon.

Matias

Saturday, February 10, 2018

Docker image to compile Toro on Linux, Part II

In the first part of this post (here), I explained how to use a Docker image to compile Toro. I worked a bit on this procedure and I modified CloudIt.sh to make it use the container. To compile Toro by using CloudIt.sh, first install Docker and then follow these steps:
1. Pull the docker image from Docker Hub:
docker pull torokernel/ubuntu-for-toro
2. Clone the torokernel git repo
3. Go to torokernel/examples and run:
./CloudIt.sh ToroHello 
If everything goes well, you will get ToroHello.img in torokernel/examples. In addition, if you have KVM installed, you will get an instance of a Toro guest that runs ToroHello.

Enjoy!

Monday, February 05, 2018

Docker image to compile Toro

Hello folks! I just created a Docker image to compile the Toro kernel examples. You can find the image at https://hub.docker.com/r/torokernel/ubuntu-for-toro/. To try it, follow these steps:

1. Install docker. You can find a good tutorial in https://docs.docker.com/install/linux/docker-ce/ubuntu/#install-docker-ce-1

2. Once installed, in a command line run:

 docker pull torokernel/ubuntu-for-toro

3. Clone the ToroKernel repository, which provides the code to be compiled:

git clone https://github.com/MatiasVara/torokernel.git

and then move current directory to ./torokernel

4. In a command line, run:

sudo docker run -it -v $(pwd):/home/torokernel torokernel/ubuntu-for-toro bash 

This command returns a bash shell in which the current directory, i.e., the torokernel directory, is mounted at /home/torokernel. So now we can just go to /home/torokernel/examples and run:

wine c:/lazarus/lazbuild.exe ToroHello.lpi

This will compile and build ToroHello.img. When we are done in the container, we can exit.

Enjoy!

Thursday, January 25, 2018

Toro supports Virtio network drivers!

Hello folks! For the last three weeks, I have been working on adding support for VirtIO network drivers in Toro (see VirtIONet.pas). In a virtualisation environment, VirtIO drivers have many benefits:
- they perform better than e1000 or other emulated network cards;
- they abstract away the hardware of the host, thus enabling the drivers to work on different hardware;
- they are a standard way to talk to network cards, supported by many hypervisors like KVM, QEMU or VirtualBox.
The way VirtIO network cards work is quite simple. They are based on the notion of a virtqueue. In the case of networking, cards mainly have two queues: the reception queue and the transmission queue. Roughly speaking, each queue has two rings of buffers: the available buffers and the used buffers. To provide a buffer to the device, the driver puts buffers in the available ring; the device then consumes them and puts them in the used ring. For example, in the case of the reception queue, the driver feeds the device by putting buffers in the available ring. Then, when a packet arrives, the device takes a buffer from the available ring, writes the content and puts it in the used ring. You can find plenty of literature on the internet; however, I would recommend this post, which also includes the C code of a driver.

I think testing is the hardest part. I found different behaviours depending on where you are testing, e.g., KVM, QEMU. For example, in KVM, if you don't set model=virtio-net, the driver just does not work. To test, I basically have my own version of QEMU which prints all the logs straight to stdout. Also, Wireshark helps to trace the traffic and find duplicate packets or other kinds of misbehaviour. The good part: when you are done with one VirtIO network driver, you can easily use it as a template because all VirtIO drivers are very similar. I have not had time yet to compare with e1000 but I am expecting good numbers :)

Cheers, Matias. 
          

Friday, January 12, 2018

Toro compiles with -O2 option

Hello everyone, I spent the last days trying to compile the Toro kernel with the -O2 option. This option tells the compiler to optimise execution by using the CPU registers, i.e., the compiler keeps data in registers instead of memory. This brings a huge improvement in performance since access to registers is far faster than access to memory. I give more details about this issue in https://github.com/MatiasVara/torokernel/issues/135. The problem I faced was that some assembler functions were violating the Windows x64 ABI. In other words, these functions were not restoring the value of registers that the compiler uses. In addition, the IRQ handlers were not doing so either. After fixing all these issues, I am able to compile with -O2 and the result is already in master. A simple comparison with TorowithFilesystem.pas shows a speedup of ~12%!!!! Compilation with -O3 is also possible but I did not run any benchmark.

Matias

Sunday, December 24, 2017

Toro compiled with FPC 3.0.4 and Lazarus 1.8.0

I am glad to announce that Toro compiles smoothly using FPC 3.0.4 from Lazarus 1.8.0. There is only one issue when running qemu from the IDE, which I am going to report. Apart from that, no other issue has been observed.

Matias 

Friday, December 15, 2017

Toro in FOSDEM'18!

I am glad to announce that Toro will be at FOSDEM'18. This time I will talk about the improvement in the scheduler to reduce the impact of idle loops, thus reducing the CPU usage of VMs. For more information, you can find the abstract here.

Friday, December 08, 2017

Reducing CPU usage on VMs that run Toro

In the last days I worked on reducing the CPU usage of Toro. I observed that VMs running Toro consume 100% of a CPU, which makes any solution based on Toro impossible for production. I identified four situations in which an idle loop is the issue:

  1. During spin locks
  2. When there is no thread in the ready state
  3. When there is no thread at all
  4. When threads only poll a variable
Cases 1, 2 and 3 are in the kernel code. Case 4, however, happens when a user thread does idle work by polling a variable. In this case, the solution is harder since the scheduler has to figure out that the thread is only polling. Intel proposes different mechanisms to reduce the impact of idle loops [1, 2]. In particular, I was interested in the use of the mwait/monitor instructions. However, I found these are not well supported on all hypervisors, so I had to rely on the hlt (halt) and pause instructions instead. I want to highlight that hlt is a privileged instruction, so only ring 0 can use it. However, since in Toro both the kernel and the application run in ring 0, hlt can be used either by the kernel or by the user. The following cases correspond to use by the kernel.

First, I tackled case 1 by introducing the pause instruction inside the loop. This relaxes the CPU while a thread is waiting for exclusive access to a resource. Cases 2 and 3 were improved by using hlt, which halts the CPU until the next interrupt. To tackle case 4, I proposed two APIs to tell the scheduler when a thread is polling a variable. When the scheduler figures out that all threads on a core are polling a variable, it just turns the core off.

I tested this on my bare-metal host (4 cores, 8 GB, 2GHz) at Scaleway with KVM and a VM running Toro. I also installed Monitorix to monitor the state of the host. To stress the system, I generate HTTP traffic and monitor the CPU usage. A few seconds after the stress stops, the CPU usage of the qemu process is only about 1%. This usage goes up and down depending on the stress. Watching top, I get a pattern like this:

%CPU  %MEM  TIME+    COMMAND
6.6   0.6   0:46.92  qemu-system-x86
63.8  0.6   0:48.84  qemu-system-x86 (stress)
99.7  0.6   0:51.84  qemu-system-x86 (stress)
45.5  0.6   0:53.21  qemu-system-x86 (stress)
2.3   0.6   0:53.28  qemu-system-x86
4.0   0.6   0:53.40  qemu-system-x86
6.6   0.6   0:53.60  qemu-system-x86

This is not always the case and sometimes Toro takes longer to turn the core off. This may happen when a socket is not correctly closed and ends up timing out. I still need to experiment more to measure how much responsiveness the system loses when the core is halted. This recent work, however, seems very promising!

[1] https://www.contrib.andrew.cmu.edu/~somlo/OSXKVM/IdleTalk.pdf
[2] Intel Volume 3, section 8.10.2 and 8.10.4

Thursday, November 30, 2017

Running Toro on top of VirtualBox

I am glad to announce that Toro is supported on VirtualBox. After fixing some bugs in the bootloader, Toro boots perfectly on VirtualBox:



This way Toro can run on Hyper-V, VMWare, QEMU, KVM and now VirtualBox. 

Wednesday, November 08, 2017

Running Toro on top of Hyper-V

Hello folks! I am working on the bootloader of Toro to make it boot on Hyper-V. The first experiments look very promising! I could run the example ToroHello on top of Hyper-V.



I found that the problem was the use of the mm0 register in the bootloader. While this works well on KVM or QEMU, on Hyper-V the system hangs/crashes. By using a general-purpose register like EBX, the problem is fixed and the kernel is loaded. I could not figure out why this happens. Also, it was a bit tricky to wake up the other cores but, finally, I got them up. In the near future, Hyper-V will be installed on every Windows machine; in this context, Toro can be an option for containers in a Windows environment.

Saturday, July 15, 2017

The four steps to deploy a Toro application in the Cloud


Hi folks! Toro makes the deployment of standalone applications in the cloud very simple. I just committed the script SetupCloudGuest.sh, which installs the tools needed to run a Toro application in a VM and then makes it public on the internet by relying on port forwarding. These are the four steps:
1) Get a machine at Scaleway that will be used to run KVM and the Toro guests.
2) Clone the Toro repo, which contains the scripts needed to set up the host.
3) Run torokernel/tests/SetupCloudGuest.sh, which installs KVM and other needed tools. Then, download the image that runs the guest. For example, you can download TorowithFileSystem.img from here. The image must be copied to torokernel/tests. This workflow supposes that development is done on another machine and the Toro image is just copied from your local machine to the host.
4) Finally, run CloudIt.sh TorowithFileSystem onlykvm

That's all folks! You can get the guest's console by connecting a VNC viewer to port 5900, or reach port 80 of the guest by connecting to port 80 of the host.

Sunday, June 18, 2017

Deploying a TORO guest in the Cloud

Hi folks! This weekend I deployed a TORO guest in Scaleway's cloud. You can check it by connecting a VNC client to 51.15.142.20:5900

Figure1. VNC Viewer on Windows connected to the Toro guest.

What you see is a TORO guest running on top of KVM. This example runs ToroHello.pas, which just prints a message to the screen. I am working on adding more examples that use the filesystem and the network stack. Here I am writing down the whole procedure to compile Toro on Ubuntu and run the example on KVM. This example, though very simple, shows how easy it is to deploy an application compiled within the Toro kernel and run it in a VM without the interference of an OS.