This involved some minor tweaks for the various types of tests, added
predicates to the test framework (necessary for test_entries and
test_alloc), and cleaned up some of the testing semantics such as
reporting how many tests are filtered, showing permutation config on
the result screen, and properly inheriting suite config in cases.
The idea behind emubd (file per block) was neat, but it doesn't add much
value over a block device that just operates on a single linear file
(other than adding a significant amount of overhead). Initially it
helped with debugging, but once the metadata format became more complex
in v2, most debugging ended up going through the debug.py script anyway.
Aside from being simpler, moving to filebd means it is also possible to
mount disk images directly.
Also introduced rambd, which keeps the disk contents in RAM. This is
very useful for testing where it increases the speed _significantly_.
- test_dirs w/ filebd - 0m7.170s
- test_dirs w/ rambd - 0m0.966s
These follow the emubd model of using the lfs_config for geometry. I'm
not convinced this is the best approach, but it gets the job done.
I've also added lfs_rambd_createcfg to add additional config similar to
lfs_file_opencfg. This is useful for specifying erase_value, which tells
the block device to simulate erases similar to flash devices. Note that
the default (-1) meets the minimum block device requirements and is the
most performant.
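
As a rough illustration of the erase_value idea, here is a minimal sketch of
a RAM-backed block device. It is not the littlefs rambd implementation: the
ram_bd_* names and struct layout are hypothetical, and it assumes that -1
means "erase leaves the contents untouched" while any other value is the byte
written over the block on erase.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical, simplified RAM block device (not the littlefs rambd API).
struct ram_bd {
    uint8_t *buffer;       // block_size * block_count bytes of "disk"
    uint32_t block_size;
    uint32_t block_count;
    int32_t erase_value;   // assumed: -1 = erase is a no-op, else fill byte
};

int ram_bd_create(struct ram_bd *bd, uint32_t block_size,
        uint32_t block_count, int32_t erase_value) {
    assert(block_size > 0 && block_count > 0);
    size_t size = (size_t)block_size * block_count;
    bd->buffer = malloc(size);
    if (!bd->buffer) {
        return -1;
    }
    bd->block_size = block_size;
    bd->block_count = block_count;
    bd->erase_value = erase_value;
    // start from a known state: the erase byte if simulated, else zeros
    memset(bd->buffer, erase_value != -1 ? erase_value : 0, size);
    return 0;
}

int ram_bd_read(const struct ram_bd *bd, uint32_t block, uint32_t off,
        void *out, uint32_t size) {
    if (block >= bd->block_count || off > bd->block_size
            || size > bd->block_size - off) {
        return -1;
    }
    memcpy(out, &bd->buffer[(size_t)block * bd->block_size + off], size);
    return 0;
}

int ram_bd_prog(struct ram_bd *bd, uint32_t block, uint32_t off,
        const void *data, uint32_t size) {
    if (block >= bd->block_count || off > bd->block_size
            || size > bd->block_size - off) {
        return -1;
    }
    memcpy(&bd->buffer[(size_t)block * bd->block_size + off], data, size);
    return 0;
}

int ram_bd_erase(struct ram_bd *bd, uint32_t block) {
    if (block >= bd->block_count) {
        return -1;
    }
    // only pay for the memset when erases are being simulated
    if (bd->erase_value != -1) {
        memset(&bd->buffer[(size_t)block * bd->block_size],
                bd->erase_value, bd->block_size);
    }
    return 0;
}

void ram_bd_destroy(struct ram_bd *bd) {
    free(bd->buffer);
    bd->buffer = NULL;
}
```

Skipping the memset when erase_value is -1 is what would make the default the
cheapest option in this sketch.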
Aside from reworking the internals of test_.py to work well with
inherited TestCase classes, this also provides the two features
that were the main reason for revamping the test framework:
1. ./scripts/test_.py --reentrant
Runs reentrant tests (tests with reentrant=true in the .toml
configuration) under gdb such that the program is killed on every
call to lfs_emubd_prog or lfs_emubd_erase.
Currently this just increments a number of prog/erases to skip, which
means it doesn't necessarily check every possible branch of the test,
but this should still provide good coverage of power-loss tests.
2. ./scripts/test_.py --gdb
Run the tests and if a failure is hit, drop into GDB. In theory this
will be very useful for reproducing and debugging test failures.
Note this can be combined with --reentrant to drop into GDB on the
exact cycle of power-loss where the tests fail.
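
The "increment a number of prog/erases to skip" strategy can be sketched
without gdb. This is only an illustration: the powerloss_* names and
example_test are hypothetical, and instead of killing the process the test
simply returns early when the simulated power loss hits. In the real
framework the disk state would persist between cycles so the test can resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical power-loss injector state.
struct powerloss {
    uint32_t skip;   // prog/erase calls allowed before power is "lost"
    uint32_t count;  // prog/erase calls seen so far in this cycle
};

// Called before every prog/erase; returns true when power should be lost.
bool powerloss_hit(struct powerloss *pl) {
    return pl->count++ >= pl->skip;
}

// Example "test": three prog/erase operations, aborting on power loss.
bool example_test(struct powerloss *pl) {
    for (int i = 0; i < 3; i++) {
        if (powerloss_hit(pl)) {
            return false;  // interrupted mid-test
        }
        // a real test would program or erase a block here
    }
    return true;  // ran to completion
}

// Rerun the test, allowing one more prog/erase each cycle, until it
// completes. Returns the number of cycles that were needed.
uint32_t powerloss_run(bool (*test)(struct powerloss *pl)) {
    assert(test != NULL);
    uint32_t cycles = 0;
    for (uint32_t skip = 0;; skip++) {
        struct powerloss pl = {.skip = skip, .count = 0};
        cycles++;
        if (test(&pl)) {
            return cycles;
        }
    }
}
```

Because the budget only grows by one per cycle, each run follows a single
path through the test, which is why not every possible branch is checked.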
- Reworked how permutations work
- Now with global defines as well (apply to all code)
- Also supports lists of different permutation sets
- Added better cleanup in tests and "make clean"
This is the start of reworking littlefs's testing framework based on
lessons learned from the initial testing framework.
1. The testing framework needs to be _flexible_. It was hacky, which by
itself isn't a downside, but it wasn't _flexible_. This limited what
could be done with the tests and there ended up being many
workarounds just to reproduce bugs.
The idea behind this revamped framework is to separate the
description of tests (tests/test_dirs.toml) and the running of tests
(scripts/test.py).
Now, with the logic moved entirely to python, it's possible to run
the test under varying environments. In addition to the "just don't
assert" run, I'm also looking to run the tests in valgrind for memory
checking, and an environment with simulated power-loss.
The test description can also contain abstract attributes that help
control how tests can be run, such as "leaky" to identify tests where
memory leaks are expected. This keeps test limitations at a minimum
without limiting how the tests can be run.
2. Multi-stage-process tests didn't really add value and limited what
the testing environment could do.
Unmounting + mounting can be done in a single process to test the
same logic. It would be really difficult to make this fail only
when memory is zeroed, though that can still be caught by
power-resilient tests.
Requiring every test to be a single process adds several options
for test execution, such as using a RAM-backed block device for
speed, or even running the tests on a device.
3. Added fancy assert interception. This wasn't really a requirement,
but something I've been wanting to experiment with for a while.
During testing, scripts/explode_asserts.py is added to the build
process. This is a custom C-preprocessor that parses out assert
statements and replaces them with _very_ verbose asserts that
wouldn't normally be possible with just C macros.
It even goes as far as to report the arguments to strcmp, since the
lack of visibility here was very annoying.
tests_/test_dirs.toml:186:assert: assert failed with "..", expected eq "..."
assert(strcmp(info.name, "...") == 0);
One downside is that simply parsing C in python is slower than the
entire rest of the compilation, but fortunately this can be
alleviated by parallelizing the test builds through make.
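
To give a feel for what such an expanded assert might look like, here is a
hand-written sketch. It is not the output of scripts/explode_asserts.py: the
assert_str_eq helper is hypothetical and only mimics the message format shown
above for an assert(strcmp(a, b) == 0) statement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Hypothetical expansion target for assert(strcmp(a, b) == 0).
// Returns 0 if the strings are equal. Otherwise formats a verbose
// message that includes both operands into msg and returns -1.
int assert_str_eq(const char *file, int line,
        const char *a, const char *b, char *msg, size_t size) {
    assert(msg != NULL && size > 0);
    if (strcmp(a, b) == 0) {
        return 0;
    }
    snprintf(msg, size,
            "%s:%d:assert: assert failed with \"%s\", expected eq \"%s\"",
            file, line, a, b);
    return -1;
}
```

A plain C assert macro can stringify the expression, but it cannot print the
runtime values of the strcmp arguments, which is the gap the preprocessing
step fills.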
Other neat bits:
- All generated files are named by appending a suffix to the test
description. This helps cleanup and means it's (theoretically) possible
to parallelize the tests.
- The generated test.c is shoved base64 into an ad-hoc Makefile. This
means it doesn't force a rebuild of tests all the time.
- Test parameterizing is now easier.
- Hopefully this framework can be repurposed also for benchmarks in the
future.
Sometimes a small, single-line code change hides a complicated
story. This is one of those times.
If you look at this diff, you may note that this is a case of
lfs_dir_fetchmatch not correctly handling a tag that invalidates a
callback used to search for some condition, in this case a search for a
parent, which is invalidated by a later dir tag overwriting the
previous dir pair.
But how can this happen? Dir-pair-tags are only overwritten during
relocations (when a block goes bad or exceeds the block_cycles config
option for dynamic wear-leveling). Other dir operations create new
directory entries. And the only lfs_dir_fetchmatch condition that relies
on overwrites (as opposed to proper deletes) is when we need to find a
directory's parent, an operation that only occurs during a _different_
relocation. And a false _positive_ can only happen if we don't have a
parent. Which is really unlikely when we search for directory parents!
This bug and minimal test case were found by Matthew Renzelmann. In an
unfortunate series of events, first a file creation causes a directory
split to occur. This creates a new, orphaned metadata-pair containing
our new file. However, the revision count on this metadata-pair
indicates the pair is due for relocation as a part of wear-leveling.
Normally, this is fine: even though this metadata-pair has no parent,
lfs_dir_find should return ENOENT and continue without error.
However, here we get hit by our fetchmatch bug. A previous, unrelated
relocation overwrites a pair which just happens to contain the block
allocated for a new metadata-pair. When we search for a parent,
lfs_dir_fetchmatch incorrectly finds this old, outdated metadata pair
and incorrectly tells our orphan it's found its parent.
As you can imagine, the orphan's disappointment must be immense.
So an unfortunately timed dir split triggers a relocation which
incorrectly finds a previously written parent that has been outdated
by another relocation.
As a solution we can outdate our found tag if it is overwritten by
an exact match during lfs_dir_fetchmatch.
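
A much-simplified sketch of that idea follows. It does not reproduce
lfs_dir_fetchmatch: the dir_tag struct and find_parent function are
hypothetical, and "exact match" is modeled here as a later tag with the same
id as the tag we found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical dir tag: an entry id plus the metadata pair it points to.
struct dir_tag {
    uint16_t id;
    uint32_t pair[2];
};

// Scan tags in commit order, looking for one that points at `pair`.
// Returns the index of the live match, or -1 if there is none.
int find_parent(const struct dir_tag *tags, size_t count,
        const uint32_t pair[2]) {
    assert(tags != NULL || count == 0);
    int found = -1;
    for (size_t i = 0; i < count; i++) {
        bool match = tags[i].pair[0] == pair[0]
                && tags[i].pair[1] == pair[1];
        if (match) {
            found = (int)i;
        } else if (found >= 0 && tags[i].id == tags[found].id) {
            // a later tag overwrote the one we found, so our match is
            // outdated and must not be reported
            found = -1;
        }
    }
    return found;
}
```

Without the else branch, the scan would keep reporting the stale tag, which
is the false positive described above.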
As a part of this I started adding a new set of tests: tests/test_relocations,
for aggressive relocation tests. This is already being appended to by
another PR. I suspect relocations are relatively under-tested and are
becoming more important due to recent improvements in wear-leveling.
The superblock entry takes up id 0 in the root directory (not all
entries are files, though currently the superblock is the only
exception). Normally, reading a directory correctly skips the
superblock and only reports non-superblock files.
However, this doesn't work perfectly for lfs_dir_seek, which tries
to be clever to not touch the disk.
Fortunately, we can fix this by adding an offset for the superblock.
This will only work while the superblock is the only non-file entry,
otherwise we would need to touch the disk to properly seek in a
directory (though we already touch the disk a bit to get dir-tails
during seeks).
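
The offset can be pictured with a small sketch. This is not the actual
lfs_dir_seek code: seek_to_id is a hypothetical helper, and it assumes that
directory offsets 0 and 1 are reserved for "." and ".." so that file entries
start at offset 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Map a directory offset (>= 2, i.e. past "." and "..") to an entry id
// without touching the disk. In the root directory, id 0 is the
// superblock, so every file entry is shifted up by one.
uint32_t seek_to_id(uint32_t off, bool is_root) {
    assert(off >= 2);
    return (off - 2) + (is_root ? 1 : 0);
}
```

The constant shift only works while the superblock is the single non-file
entry and always sits at id 0.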
Found by jhartika
Stop proactively relocating blocks during migrations. This can cause a number of
failure states, such as clobbering the v1 superblock if we relocate root, and
invalidating directory pointers if we relocate the head of a directory. On top
of this, relocations increase the overall complexity of lfs_migration, which is
already a delicate operation.
This is caused by dir->head not being updated when dir->m.pair may be.
This causes the two to fall out of sync and later dir rewinds to fail.
This bug dates all the way back to the first commits of littlefs, so
it's surprising it has avoided detection for this long. Perhaps because
lfs_dir_rewind is not used often.
In my view, file updates are committed to the filesystem only on
sync or close. There is an extra word 'no' here.
Fixes: bdff4bc59e ("Updated DESIGN.md to reflect v2 changes")
Signed-off-by: liaoweixiong <liaoweixiong@allwinnertech.com>
To ensure 16-bit devices do not invalidly truncate lfs_file_write return codes, change
the return variable to be lfs_ssize_t, which is the lfs_file_write return type, and
cast to int if it is a negative error code.
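
The truncation can be reproduced on any host by standing in fixed-width types
for the 16-bit ones. Everything named here is hypothetical: small_int_t plays
the role of int on a 16-bit target, ssize32_t plays the role of lfs_ssize_t,
and fake_write stands in for a write that returns a byte count. Converting an
out-of-range value to a narrower signed type is implementation-defined; gcc
and clang wrap modulo 2^16.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t small_int_t;  // stand-in for int on a 16-bit target
typedef int32_t ssize32_t;    // stand-in for lfs_ssize_t

// Hypothetical write: returns the number of bytes "written".
ssize32_t fake_write(uint32_t size) {
    assert(size <= INT32_MAX);
    return (ssize32_t)size;
}

// Buggy pattern: the byte count is squeezed into a 16-bit variable, so
// a large positive count can turn into a bogus negative "error".
small_int_t flush_truncating(uint32_t size) {
    small_int_t res = (small_int_t)fake_write(size);
    if (res < 0) {
        return res;
    }
    return 0;
}

// Fixed pattern: keep the full-width result and only narrow it once we
// know it is a (small) negative error code.
small_int_t flush_fixed(uint32_t size) {
    ssize32_t res = fake_write(size);
    if (res < 0) {
        return (small_int_t)res;
    }
    return 0;
}
```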
lfs_dir_find returns either a negative return code or a tag.
For 32-bit machines with int as 32 bits this coincides, but for processors
with a smaller int, we need to ensure a 32-bit value is returned, so change
the return type to lfs_stag_t.
Build warnings exist on a gcc based 16 bit compiler. Cast relevant types
to fix.
littlefs/lfs.c: In function 'lfs_gstate_xororphans':
littlefs/lfs.c:355:5: warning: left shift count >= width of type
littlefs/lfs.c: In function 'lfs_dir_fetchmatch':
littlefs/lfs.c:849:17: warning: left shift count >= width of type
littlefs/lfs.c: In function 'lfs_dir_commitcrc':
littlefs/lfs.c:1278:9: warning: left shift count >= width of type
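
For context, this warning appears when an operand is promoted only to a
16-bit int and then shifted by 16 or more. A minimal sketch of the kind of
cast that avoids it is shown below. The make_tag function and its field
layout (type at bit 20, id at bit 10, size in the low bits) are illustrative
assumptions, not the code at the lines listed above.

```c
#include <assert.h>
#include <stdint.h>

// On a target where int is 16 bits, `type << 20` would shift a 16-bit
// value by more than its width. Casting each operand to uint32_t first
// makes the shift happen in 32 bits on every target.
uint32_t make_tag(uint16_t type, uint16_t id, uint16_t size) {
    assert(type <= 0x7ff && id <= 0x3ff && size <= 0x3ff);
    return ((uint32_t)type << 20) | ((uint32_t)id << 10) | (uint32_t)size;
}
```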
This reverts commit fdd239fe21.
Bypassing cache turned out to be a mistake which causes more problems
than it solves. The device driver should deal with alignment if this is
required - trying to do that in a file system is not a viable solution
anyway.
This is a result of feedback that the current release notes made it too
difficult to see what changes happened on patch releases. From my
experience as well it became difficult to chase down which release a
commit landed on.
The risk is that this creates additional noise, both for the release
page and for user notifications. I am open to feedback if this causes a
problem.
Other tweaks on the CI side, which came from iterating on the same
scheme for coru and equeue:
- Changed version branch updates to be atomic (vN and vN-prefix). This
makes it a bit easier to fix if one of the pushes fails due to a rogue
branch with the same name.
- Added GEKY_BOT_DRAFT as a CI macro that can optionally switch between
only creating drafts or immediately posting a release. The default is
what I will be trying with littlefs which is to draft minor/major
releases, but automatically create patch releases.
The real benefit of automatic releases is to use on tiny repos that
don't really have an active maintainer. Though this is definitely no
longer the case with littlefs, and I'm happy it has gained this much
attention.
When using lfs_file_truncate() to make a file shorter, the file block and
off were incorrectly positioned at the new end, resulting in invalid
data being accessed when reading. Lift the seek pointer restoration to apply
to both increasing and reducing truncates.
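
The shape of the fix can be sketched with an in-memory file. This is not the
littlefs implementation: mem_file and mem_file_truncate are hypothetical, and
the internal repositioning is modeled by simply moving pos while the size
changes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_FILE_MAX 64

// Hypothetical in-memory file with a seek pointer.
struct mem_file {
    uint8_t data[MEM_FILE_MAX];
    uint32_t size;
    uint32_t pos;
};

int mem_file_truncate(struct mem_file *f, uint32_t size) {
    assert(f->size <= MEM_FILE_MAX);
    if (size > MEM_FILE_MAX) {
        return -1;
    }
    // remember the caller's seek pointer before moving it internally
    uint32_t pos = f->pos;
    if (size > f->size) {
        // growing: seek to the old end and pad with zeros
        f->pos = f->size;
        memset(&f->data[f->pos], 0, size - f->size);
    } else {
        // shrinking: the internal position ends up at the new end
        f->pos = size;
    }
    f->size = size;
    // restore the seek pointer on both paths, not just when growing
    f->pos = pos;
    return 0;
}
```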
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
The difference between 0xffffffff and 0xfffffffe is too subtle. Use
names that reflect what the value represents.
Signed-off-by: Peter A. Bigot <pab@pabigot.com>