Optimization of the rename case.

Rename can be VERY time consuming. One of the reasons is the 4 recursion
level depth of lfs_dir_traverse() seen if a compaction happened during the
rename.

lfs_dir_compact()
  size computation
    [1] lfs_dir_traverse(cb=lfs_dir_commit_size)
         - do 'duplicates and tag update'
       [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
           - Reaching a LFS_FROM_MOVE tag (here)
         [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1]) <= on 'from' dir
             - do 'duplicates and tag update'
           [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[3])
  followed by the compaction itself:
    [1] lfs_dir_traverse(cb=lfs_dir_commit_commit)
         - do 'duplicates and tag update'
       [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
           - Reaching a LFS_FROM_MOVE tag (here)
         [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1]) <= on 'from' dir
             - do 'duplicates and tag update'
           [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[3])

Yet, analyse shows that levels [3] and [4] don't perform anything
if the callback is lfs_dir_traverse_filter...

A practical example:

- format and mount a 4KB block FS
- create 100 files of 256 Bytes named "/dummy_%d"
- create a 1024 Byte file "/test"
- rename "/test" "/test_rename"
- create a 1024 Byte file "/test"
- rename "/test" "/test_rename"
This triggers a compaction where lfs_dir_traverse was called 148393 times,
generating 25e6+ lfs_bd_read calls (~100 MB+ of data)

With the optimization, lfs_dir_traverse is now called 3248 times
(589e3 lfs_bds_calls (~2.3MB of data)

=> x 43 improvement...
This commit is contained in:
Arnaud Mouiche
2021-12-01 14:27:45 +01:00
committed by Christopher Haster
parent 9c7e232086
commit c2fa1bb7df

33
lfs.c
View File

@@ -818,6 +818,39 @@ static int lfs_dir_traverse(lfs_t *lfs,
} else if (lfs_tag_type3(tag) == LFS_FROM_MOVE) {
uint16_t fromid = lfs_tag_size(tag);
uint16_t toid = lfs_tag_id(tag);
// There is a huge room for simple optimization for the rename case
// where we can see up to 4 levels of lfs_dir_traverse recursions
// when compaction happened (for example):
//
// >lfs_dir_compact
// [1] lfs_dir_traverse(cb=lfs_dir_commit_size)
// - do 'duplicates and tag update'
// [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
// - Reaching a LFS_FROM_MOVE tag (here)
// [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter,
// data=tag[1]) <= on 'from' dir
// - do 'duplicates and tag update'
// [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter,
// data=tag[3])
//
// Yet, for LFS_FROM_MOVE when cb == lfs_dir_traverse_filter
// traverse [3] and [4] don't do anything:
// - if [2] is supposed to match 'toid' for duplication, a preceding
// ERASE or CREATE with the same tag id will already have stopped
// the search.
// - if [2] is here to update tag value of CREATE/DELETE attr found
// during the scan, since [3] is looking for LFS_TYPE_STRUCT only
// and call lfs_dir_traverse_filter with LFS_TYPE_STRUCT attr
// wheras lfs_dir_traverse_filter only modify tag on CREATE or
// DELETE. Consequently, cb called from [4] will never stop the
// search from [2].
// - [4] may call lfs_dir_traverse_filter, but with action on a
// tag[3] pointer completely different from tag[1]
if (cb == lfs_dir_traverse_filter) {
continue;
}
// note: buffer = oldcwd dir
int err = lfs_dir_traverse(lfs,
buffer, 0, 0xffffffff, NULL, 0,
LFS_MKTAG(0x600, 0x3ff, 0),