Implement readahead hint for low level driver functions #16

jcdubois · 2020-12-07T19:25:02Z

If an application reads data from files in chunks of 4 KB, at this time all I/O toward the persistent storage is synchronous to the call, and the whole process starts over for the next 4 KB chunk.
Now, assuming we have a relatively intelligent and autonomous persistent storage peripheral, like an eMMC controller able to do DMA transfers in the background, the present usage scheme cannot really leverage this ability, because Reliance uses the persistent storage synchronously.
For example, if the application had some processing work to do on the present 4 KB chunk, it would be beneficial (performance-wise) to trigger the retrieval of the next 4 KB chunk in the background (by the intelligent persistent storage peripheral), so that it is available (or almost available) when the application is done with the present block.
I have been considering adding this behavior only to the low-level driver functions, without particular support or hints from Reliance, but I think that implementing this behavior blindly on all "read calls" could be counterproductive, as these APIs are also used to retrieve inode information and other file system metadata (and these are mostly short single-block reads).
So I was wondering if it would make sense to add either some parameters to the existing read functions or even some new functions to "give a hint" to the lower driver API that it would be beneficial to initiate some "readahead" behavior if supported by the hardware.
Do you think such a feature could be beneficial to Reliance, and would you be interested in adding it? If so, assuming I could make a prototype for it, could you give some guidance on how you would prefer it to be implemented?

danielrlewis · 2020-12-08T10:17:23Z

Hi @jcdubois, readahead has come up before but we haven't yet pursued it.

> Do you think such a feature could be beneficial to Reliance, and would you be interested in adding it?

I'll raise the issue. FYI, we are in the middle of a week-long company function, but we'll get back to you when we can.

> I have been considering adding this behavior only to the low-level driver functions, without particular support or hints from Reliance, but I think that implementing this behavior blindly on all "read calls" could be counterproductive, as these APIs are also used to retrieve inode information and other file system metadata (and these are mostly short single-block reads).

Agreed. Doing it right would require core support.

> So I was wondering if it would make sense to add either some parameters to the existing read functions or even some new functions to "give a hint" to the lower driver API that it would be beneficial to initiate some "readahead" behavior if supported by the hardware. [...] If so, assuming I could make a prototype for it, could you give some guidance on how you would prefer it to be implemented?

Yes, I think that would make sense. I think there would be a new function like RedOsBDevReadAhead() (or maybe just a new flag for RedOsBDevRead()) that would tell the block device to perform a readahead operation if supported (if it's not supported, the call would do nothing and return success). The core would call this function on sectors that it anticipates are likely to be read next. For example, if logical block 0 in a file is read, the core might issue a readahead for the sectors comprising logical block 1. That logic belongs in the core, since the core knows whether logical block 1 is allocated and where it is allocated.
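To make the "do nothing and return success" contract concrete, here is a minimal sketch of what such a hint function might look like. Everything in it is hypothetical: `EXAMPLEBDEV`, `ExampleBDevReadahead()`, `ExampleBDevReadaheadCovers()`, and `STATUS_OK` are illustrative names, not the Reliance Edge API, and the body only records the request where a real driver would start a background (for example, DMA) transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names for illustration; not the Reliance Edge API. */
typedef int STATUS;
#define STATUS_OK 0

typedef struct {
    bool     fSupportsAsync;   /* true if the driver can start background transfers */
    bool     fPending;         /* true once a readahead request has been recorded */
    uint64_t ullPendingSector; /* first sector of the recorded request */
    uint32_t ulPendingCount;   /* number of sectors in the recorded request */
} EXAMPLEBDEV;

/* Hint that sectors [ullSector, ullSector + ulCount) are likely to be read
   soon. The hint is advisory: the function always returns success, and it
   does nothing when the driver cannot perform background transfers. */
STATUS ExampleBDevReadahead(EXAMPLEBDEV *pDev, uint64_t ullSector, uint32_t ulCount)
{
    assert(pDev != NULL);

    if (pDev->fSupportsAsync && (ulCount > 0U)) {
        /* A real driver would start the background transfer here; this
           sketch only records which range was requested. */
        pDev->ullPendingSector = ullSector;
        pDev->ulPendingCount = ulCount;
        pDev->fPending = true;
    }

    return STATUS_OK;
}

/* Returns true if a read of the given range lies entirely inside the
   recorded readahead request. */
bool ExampleBDevReadaheadCovers(const EXAMPLEBDEV *pDev, uint64_t ullSector, uint32_t ulCount)
{
    assert(pDev != NULL);

    return pDev->fPending
        && (ullSector >= pDev->ullPendingSector)
        && ((ullSector + ulCount) <= (pDev->ullPendingSector + pDev->ulPendingCount));
}
```

Because the hint never fails, the core could call it unconditionally after a read and leave it to each port to decide whether anything happens.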

If we wanted to go further, we could have the POSIX-like layer keep track of the access pattern in the file descriptor handle, such that readahead was used for sequential reads but not random-access reads.
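As a rough illustration of that idea, the sketch below tracks, per handle, where the previous read ended and reports whether the next read continues from that point. The `EXAMPLEHANDLE` structure and `ExampleNoteRead()` are hypothetical names invented for this example; they are not taken from the Reliance Edge sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-handle state for illustration; not a Reliance Edge type. */
typedef struct {
    uint64_t ullNextOffset; /* offset just past the end of the previous read */
    bool     fHasRead;      /* false until the first read on this handle */
} EXAMPLEHANDLE;

/* Record a read on the handle and report whether it started exactly where
   the previous read ended. The first read on a handle is never treated as
   sequential, since there is no access pattern to judge yet. */
bool ExampleNoteRead(EXAMPLEHANDLE *pHandle, uint64_t ullOffset, uint32_t ulLength)
{
    bool fSequential;

    assert(pHandle != NULL);

    fSequential = pHandle->fHasRead && (ullOffset == pHandle->ullNextOffset);

    pHandle->ullNextOffset = ullOffset + ulLength;
    pHandle->fHasRead = true;

    return fSequential;
}
```

A caller would issue the readahead hint only when this returns true, so a random-access reader never triggers background transfers it will not use.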

Those are my initial thoughts. I'll think about this further if there's interest from the rest of the team.

danielrlewis · 2020-12-19T00:55:18Z

Hi @jcdubois, an update on this. I will take up the task of creating a prototype for readahead, so that I can gauge the performance improvement potential. Assuming that I am not interrupted by a higher priority task, I should be able to complete the prototype next week and have some results.

Do you have an environment where you could test the performance benefits of readahead? If so, I could put the prototype code on a GitHub branch so you could also try it out.

jcdubois · 2020-12-19T10:58:25Z

Hi @danielrlewis, this is a nice early Christmas present you are preparing !!!

Just now, I am on Christmas holiday (good for me!) and away from the office so I have no access to our platform to test things for real. I'll be back at work in January and we could assess things then.

However, I am interested in the changes you are planning, and I would certainly review them and give feedback (relative to our planned use case) if you publish the code on a GitHub branch.

Merry Christmas to all.

danielrlewis · 2021-01-09T01:22:59Z

A quick update before the weekend: While this did get preempted by other projects, I was able to work on it this week. I have implemented a prototype and preliminary performance results look promising. I should (hopefully) have more to share next week...

jcdubois · 2021-01-09T21:19:43Z

This is great news.
I am very curious about the performance improvement you are recording.

For context, see issue #16 in the Reliance Edge repository on GitHub.

This allows for functional testing of the readahead feature on a Linux host machine. It is not intended for performance testing. The implementation is somewhat messy.

Readahead is only used when the file access has been sequential. This is tracked in the POSIX-like API file handles. The FSE API does not support readahead at this time. Additionally, the core determines whether readahead makes sense for a given sequential read call: see Readahead() in inodedata.c. For simplicity, when readahead is performed, the amount requested is always a single block of data. It might be worthwhile to make this a configurable value.

danielrlewis · 2021-01-12T23:28:59Z

@jcdubois:

> I am very curious about the performance improvement you are recording.

I measured the performance of the readahead prototype on a Renesas R-Car H3 Starter Kit (H3SK) board. I used this board for convenience's sake, since a) I had one at my home office; and b) the RTOS storage drivers for it are asynchronous, which made implementing readahead relatively easy. However, it should be noted that the board has an ARMv8 application processor, which is fast by embedded standards—not the best environment to showcase readahead, since that feature might show more benefit on a slow CPU where code execution overhead is more substantial.

In my measurements, readahead did not have any appreciable impact when reading sequentially in a tight loop: i.e., when reading the data, doing nothing with it, and immediately reading the next portion. Reading 4 KB at a time (which was the file system block size), I saw (on this particular SD card) about 1760 KB/sec, with or without enabling readahead.

However, if the test paused between reads, to simulate "doing something" with the data, then readahead resulted in substantially faster throughput:

1 millisecond pause: 3460 KB/sec
2 millisecond pause: 17857 KB/sec
3 millisecond pause: 276523 KB/sec

(I also tested with these pauses when readahead was disabled, but as expected the pause did not increase throughput in that case.)

Readahead results for pauses >3 milliseconds were the same as 3 milliseconds. If the time between reads is longer than the time necessary to read the data from the SD card, then each read is very fast because it's just a copy between RAM buffers.

The "pause" for the above numbers was a sleep, but an active pause (no sleep, continuing to use the CPU) yielded similar results. For instance, invoking printf() between each read yielded results similar to the 1 millisecond pause.

Since reading in a tight loop—one read after another with no pauses—did not show any improvement, I conclude that on this board, with its fast CPU, the amount of CPU time to execute a read is insignificant in comparison to the I/O time for a read. I do wonder whether that would still be the case on a board with a slower microcontroller processor. However, I don't have a handy environment to test that (many microcontrollers have SD/MMC drivers that do DMA transfers, but that is buried deep in the guts of the driver, and substantial driver refactoring would be required to start the DMA transfer without waiting for it to finish, as would be needed for asynchronous readahead).

In the prototype, each readahead request is for a block's worth of data. I also tested sequential read performance with I/O sizes less than the block size, with similar results. I did not test I/O sizes larger than the block size, but presumably the benefit of readahead would be smaller, since only a portion of each request could be satisfied from the readahead data. Similarly, a smaller benefit would be expected if the block size is larger than the sector size and it's only possible to initiate a background transfer of one sector's worth of data.

The code for the prototype can be found on the readahead-prototype branch. I should caveat that this is prototype code, which has not been extensively tested or reviewed. If you want to try it out, you will need to implement the RedOsBDevReadahead() function which the core uses to hint that a read is expected. I cannot share the implementation that I was using (the H3SK board runs a proprietary RTOS that, for GPLv2 licensing reasons, cannot be supported by the open source version of Reliance Edge), but I think it should be fairly straightforward to implement, provided that the storage drivers have an interface that allows for background transfers. If you do try it, I would be interested in seeing any performance results that you gather.

The branch also includes a Linux implementation which simulates readahead using a separate thread. It's somewhat messy and not intended for performance testing but it does allow the feature to be exercised in a Linux development environment.

I will be sharing these results with the rest of the Reliance Edge team for discussion. I will keep you posted...

jcdubois · 2021-01-13T20:40:21Z

OK, so on your platform, transferring 4 KB seems to take about 2.2 ms.
With the readahead hint patch, you can use these 2.2 ms to do other work on your processor (here you only wait) without impacting your file system throughput, as long as your processing takes less than 2.2 ms.
If your processing exceeds 2.2 ms, then your application's file system throughput will start to drop, but it will still be higher than with synchronous read calls only.
Overall, the change seems to be very beneficial. This is great! We will have to try it on our side.
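The reasoning above can be written down as a small back-of-the-envelope model. This is only a sketch under simplifying assumptions: the transfer time is derived from the reported 1760 KB/sec for 4 KB reads (4 / 1760 s, roughly 2.27 ms), the RAM-to-RAM copy and CPU overhead are ignored, and the function names are made up for this example. It is not a reproduction of the measurements.

```c
#include <assert.h>

/* Time (ms) the application spends blocked inside each read call.
   Without readahead, every read waits for the full transfer. With
   readahead, the transfer overlaps the application's own work, so only
   the part of the transfer not hidden by that work remains. */
double ExampleBlockedMs(double dTransferMs, double dWorkMs, int fReadahead)
{
    assert((dTransferMs >= 0.0) && (dWorkMs >= 0.0));

    if (!fReadahead) {
        return dTransferMs;
    }

    return (dWorkMs >= dTransferMs) ? 0.0 : (dTransferMs - dWorkMs);
}

/* Total time (ms) per chunk: application work plus time blocked in the read. */
double ExampleCycleMs(double dTransferMs, double dWorkMs, int fReadahead)
{
    return dWorkMs + ExampleBlockedMs(dTransferMs, dWorkMs, fReadahead);
}
```

Under this model, with a transfer time of about 2.27 ms, 1 ms of processing per chunk costs about 2.27 ms per chunk with readahead versus about 3.27 ms without, while 3 ms of processing costs 3 ms with readahead versus about 5.27 ms without.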

danielrlewis added a commit that referenced this issue Jan 12, 2021

readahead: define config macro and osbdev func

e66ade5

danielrlewis added the enhancement label Sep 7, 2023
