thirdparty-littlefs

mirror of https://github.com/eledio-devices/thirdparty-littlefs.git synced 2025-10-31 00:32:38 +01:00

Author	SHA1	Message	Date
Christopher Haster	cfe779fc08	Merge pull request #508 from littlefs-project/fix-sanity-check Moved sanity check in lfs_format after compaction	2021-01-18 11:47:23 -06:00
Christopher Haster	0db6466984	Merge pull request #502 from mon/feature/meta_limits Add metadata_max config to help performance on devices with large blocks	2021-01-18 11:45:34 -06:00
Christopher Haster	10a08833c6	Moved lfs_mdir_isopen behind LFS_NO_ASSERT lfs_mdir_isopen goes unused if asserts are disabled, and this caused an "unused function" warning on Clang (curiously not on GCC since the function was static inline, commonly used for header-only functions). Also removed "inline" from the lfs_mdir_* functions as these involve linked-list operations and really shouldn't be inlined. And since they are static, inlining should occur automatically if there is a benefit. Found by dpgeorge	2021-01-18 11:41:18 -06:00
Christopher Haster	2b804537b0	Moved sanity check in lfs_format after compaction After a bit of tweaking in 9dde5c7 to write out all superblocks during lfs_format, additional writes were added after the sanity checking normally done at the end. This turned out to be a problem when porting littlefs, as it makes it easy for addressing issues to not get caught during lfs_format. Found by marekr, tristanclare94, and mjs513	2020-12-22 11:47:48 -06:00
Will	37f4de2976	Remove inline_files_max and lfs_t entry for metadata_max	2020-12-18 13:05:20 +10:00
Will	6b16dafb4d	Add metadata_max and inline_file_max to config We have seen poor read performance on NAND flashes with 128kB blocks. The root cause is inline files having to traverse many sets of metadata pairs inside the current block before being fully reconstructed. Simply disabling inline files is not enough, as the metadata will still fill up the block and eventually need to be compacted. By allowing configuration of how much size metadata takes up, along with limiting (or disabling) inline file size, we achieve read performance improvements on an order of magnitude.	2020-12-15 12:59:32 +10:00
Christopher Haster	6a7012774d	Renamed internal lfs_raw -> lfs_raw functions - Prefixing with raw is slightly more readable, follows common-prefix rule - Matches existing raw prefixes in testbd	2020-12-06 00:26:24 -06:00
Christopher Haster	5783eea0de	Merge pull request #490 from littlefs-project/fix-alloc-eviction Fix allocation-eviction issue when erase state is multiple of block_cycles+1	2020-12-04 00:49:09 -06:00
Christopher Haster	2bb523421e	Moved lfs_mlist_isopen checks into the API wrappers This indirectly solves an issue with lfs_file_rawclose asserting when lfs_file_opencfg errors since appending to the mlist occurs after open. It also may speed up some of the internal operations such as the lfs_file_write used to resolve unflushed data. The idea behind adopting mlist over flags is that realistically it's unlikely for the user to open a significant number of files (enough for big O to kick in). That being said, moving the mlist asserts into the API wrappers does protect some of the internal operations from scaling based on the number of open files.	2020-12-04 00:42:32 -06:00
Noah Gorny	7388b2938a	Deprecate LFS_F_OPENED and use lfs_mlist_isused instead Instead of additional flag, we can just go through the mlist.	2020-12-04 00:26:19 -06:00
Christopher Haster	ce425a56c3	Merge pull request #470 from renesas/SWFLEX-1517-littlefs-thread-safe-option Add thread safe wrappers	2020-12-03 23:47:32 -06:00
Christopher Haster	45afded784	Moved LFS_TRACE calls to API wrapper functions This removes quite a bit of extra code needed to entertwine the LFS_TRACE calls into the original funcions. Also changed temporary return type to match API declaration where necessary.	2020-12-03 23:46:59 -06:00
Christopher Haster	00a9ba7826	Tweaked thread-safe implementation - Stayed on non-system include for lfs_util.h for now - Named internal functions "lfs_functionraw" - Merged lfs_fs_traverseraw - Added LFS_LOCK/UNLOCK macros - Changed LFS_THREADSAFE from 1/0 to defined/undefined to match LFS_READONLY	2020-12-03 23:46:59 -06:00
Bill Gesner	fc6988c7c3	make raw functions static. formatting tweaks	2020-12-03 23:46:54 -06:00
Bill Gesner	d0f055d321	Squash of thread-safe PR cleanup - expand functions - add comment - rename functions - fix locking issue in format and mount - use global include - fix ac6 linker issue - use the global config file - address review comments - minor cleanup - minor cleanup - review comments	2020-12-03 23:41:01 -06:00
Christopher Haster	b9fa33f9bc	Merge pull request #480 from maximevince/master Add LFS_READONLY define, to allow smaller builds providing read-only mode	2020-12-03 23:06:00 -06:00
Maxime Vincent	754b4c3cda	Squash of LFS_READONLY cleanup - undef unavailable function declarations altogether - even less code, assert on write attempts - remove LFS_O_WRONLY and other flags when compiling with LFS_READONLY - do not annotate #endif, as requested - move ifdef before comments blocks, rework dangling opening bracket - ifdef file flags that are not needed in read-only mode - slight refactor - ifdef LFS_F_ERRED out as well	2020-12-03 23:03:29 -06:00
Christopher Haster	584eb26efc	Merge pull request #443 from NoahGorny/add-already-opened-assert Assert that the file isnt open in lfs_file_opencfg	2020-12-03 22:43:10 -06:00
Noah Gorny	008ebc37df	Add lfs_mlist_append/remove helper	2020-12-03 22:42:39 -06:00
Christopher Haster	66272067ab	Merge pull request #395 from gmpy/improve-write-performance lfs_bd_cmp() compares more bytes at one time	2020-12-03 22:34:47 -06:00
Christopher Haster	b8dcf10974	Changed lfs_dir_alloc to maximize block cycles for new metadata pairs Previously we only bumped the revision count if an eviction would occur immediately (and possibly corrupt littlefs). This works, but does risk an unoptimal superblock size if an almost-exhausted superblock was allocated during lfs_format. As pointed out by tim-nordell-nimbelink, we can align the revision count to maximize the number of block cycles without breaking the existing requirements of increasing revision counts. As an added benefit, littlefs's wear-leveling should behave more consistently after this change.	2020-11-28 22:46:11 -06:00
Christopher Haster	0aba71d0d6	Fixed single unchecked bit during commit verification This bug was exposed by the bad-block tests due to changes to block allocation, but could have been hit before these changes. In flash, when blocks fail, they don't fail in a predictable manner. To account for this, the bad-block tests check a number of failure behaviors. The interesting one here is "LFS_TESTBD_BADBLOCK_ERASENOOP", in which bad blocks can not be erased or programmed, and are stuck with the data written at the time the blocks go bad. This is actually a pretty realistic failure behavior, since flash needs a large voltage to force the electrons of the floating gates. Though realistically, such a failure would like corrupt the data a bit, not leave the underlying data perfectly intact. LFS_TESTBD_BADBLOCK_ERASENOOP is rather interesting to test for because it means bad blocks can end up with perfectly valid CRCs after a failed write, confusing littlefs. --- In this case, we had the perfect series of operations such that a test was repeatedly writing the same sequence of metadata commits to the same block, which eventually goes bad, leaving the block stuck with metadata that occurs later in the sequence. What this means is that after the first commit, the metadata block contained both the first and second commits, even though the loop in the test hadn't reached that point yet. expected actual .----------. .----------. \| commit 1 \| \| commit 1 \| \| crc 1 \| \| crc 1 \| \| \| \| commit 2 <-- (from previous iteration) \| \| \| crc 2 \| '----------' '----------' To protect against this, littlefs normally compares the written CRC against the expected CRC, but because this was the exact same data that it was going to write, this CRCs end up the same. Ah! But doesn't littlefs also encode the state of the next page to keep track of if the next page has been erased or not? Wouldn't that change between iterations? It does! In a single bit in the CRC-tag. But thanks to some incorrect logic attempting to avoid an extra condition in the loop for writing out padding commits, the CRC that littlefs checked against was the CRC immediately before we include the "is-next-page-erased" bit. Changing the verification check to use the same CRC as what is used to verify commits on fetch solves this problem.	2020-11-22 15:07:16 -06:00
Christopher Haster	d04c1392c0	Fixed allocation-eviction issue when erase state is multiple of block_cycles+1 This rather interesting corner-case arises in lfs_dir_alloc anytime the uninitialized revision count happens to be a multiple of block_cycles+1. For example, the source of the bug found by tim-nordell-nimbelink: rev = 2742492087 block_cycles = 100 2742492087 % (100+1) = 0 The reason for this weird block_cycles+1 case is due to a fix for a previous bug in `fe957de`. To avoid aliasing, which would cause metadata pairs to wear unevenly, block_cycles incremented to the next odd number. Normally, littlefs tweaks the revision count of blocks during lfs_dir_alloc in order to make sure evictions can't happen on the first compact. Otherwise, higher-level logic such as lfs_format would break. However, this wasn't updated with the aliasing fix in `fe957de`, so lfs_dir_alloc was only rounding the revision count to the nearest even number. The current fix is to change the logic in lfs_dir_alloc to explicitly check for the eviction condition and increment if eviction would occur. Found by tim-nordell-nimbelink	2020-11-22 00:40:58 -06:00
Christopher Haster	f215027fd4	Switched to CRC as seed collection function instead of xor As noted by gtaska, we are sitting on a better hash-combining function than xor: CRC. Previous issues with xor were solvable, but relying on xor for this isn't really worth the risk when we already have a CRC function readily available. To quote a study found by gtaska: https://michiel.buddingh.eu/distribution-of-hash-values > CRC32 seems to score really well, but its graph is skewed by the results > of Dataset 5 (binary numbers), which may or may not be too synthetic to > be considered a fair benchmark. But even if you substract the results > from that test, it does not fare significantly worse than other, > cryptographic hash functions.	2020-11-20 00:38:41 -06:00
Christopher Haster	1ae4b36f2a	Removed unnecessary randomization of offsets in lfs_alloc_reset On first read, randomizing the allocators offset may seem appropriate for lfs_alloc_reset. However, it ends up using the filesystem-fed pseudorandom seed in situations it wasn't designed for. As noted by gtaska, the combination of using xors for feeding the seed and multiple traverses of the same CRCs can cause the seed to flip to zeros with concerning frequency. Removed the randomization from lfs_alloc_reset, leaving it in only lfs_mount. Found by gtaska	2020-11-20 00:18:13 -06:00
Christopher Haster	480cdd9f81	Fixed incorrect modulus in lfs_alloc_reset Modulus of the offset by block_size was clearly a typo, and should be block_count. Interesting to note that later moduluses during alloc calculations prevents this from breaking anything, but as gtaska notes it could skew the wear-leveling distribution. Found by guiserle and gtaska	2020-11-20 00:02:19 -06:00
Noah Gorny	6303558aee	Use LFS_O_RDWR instead of magic number in lfs_file_* asserts	2020-11-19 01:51:39 +02:00
Noah Gorny	4bd653dd00	Assert that file/dir struct is not reused in lfs_file_opencfg/lfs_dir_open	2020-11-19 01:51:39 +02:00
Maxime Vincent	8e6826c4e2	Add LFS_READYONLY define, to allow smaller builds providing read-only mode	2020-10-28 16:09:13 +01:00
Bill Gesner	10ac6b9cf0	add thread safe wrappers	2020-09-17 23:41:20 +00:00
Deomid "rojer" Ryabkov	1b033e9ab6	Fix -Wmissing-field-initializers	2020-04-03 02:18:14 +01:00
Christopher Haster	7257681f5d	Merge branch 'master' into test-revamp	2020-03-31 18:24:54 -05:00
Christopher Haster	6121495444	Merge pull request #266 from FreddieChopin/revert-bypass-cache Revert "Don't bypass cache in `lfs_cache_prog()` and `lfs_cache_read()`"	2020-03-31 18:22:19 -05:00
Christopher Haster	5137e4b0ba	Last minute tweaks to debug scripts - Standardized littlefs debug statements to use hex prefixes and brackets for printing pairs. - Removed the entry behavior for readtree and made -t the default. This is because 1. the CTZ skip-list parsing was broken, which is not surprising, and 2. the entry parsing was more complicated than useful. This functionality may be better implemented as a proper filesystem read script, complete with directory tree dumping. - Changed test.py's --gdb argument to take [init, main, assert], this matches the names of the stages in C's startup. - Added printing of tail to all mdir dumps in readtree/readmdir. - Added a print for if any mdirs are corrupted in readtree. - Added debug script side-effects to .gitignore.	2020-03-29 21:19:33 -05:00
Derek Thrasher	f17d3d7eba	Minor cleanup - Removed the declaration of lfs_alloc_ack - Consistent brackets	2020-03-29 14:12:30 -05:00
Derek Thrasher	5e5b5d8572	(chore) updates from PR, we decided not to move forward with changing v1 code since it can be risky. Let's improve the future! Also renamed and moved around a the lookahead free / reset function	2020-03-29 14:12:30 -05:00
Derek Thrasher	d498b9fb31	(bugfix) adding line function to clear out all the global 'free' information so that we can reset it after a failed traversal	2020-03-29 14:12:30 -05:00
Christopher Haster	4677421aba	Added "evil" tests and detecion/recovery from bad pointers and infinite loops These two features have been much requested by users, and have even had several PRs proposed to fix these in several cases. Before this, these error conditions usually were caught by internal asserts, however asserts prevented users from implementing their own workarounds. It's taken me a while to provide/accept a useful recovery mechanism (returning LFS_ERR_CORRUPT instead of asserting) because my original thinking was that these error conditions only occur due to bugs in the filesystem, and these bugs should be fixed properly. While I still think this is mostly true, the point has been made clear that being able to recover from these conditions is definitely worth the code cost. Hopefully this new behaviour helps the longevity of devices even if the storage code fails. Another, less important, reason I didn't want to accept fixes for these situations was the lack of tests that prove the code's value. This has been fixed with the new testing framework thanks to the additional of "internal tests" which can call C static functions and really take advantage of the internal information of the filesystem.	2020-03-20 09:26:07 -05:00
WeiXiong Liao	64f70f51b0	lfs_bd_cmp() compares more bytes at one time It's very slowly to compare one byte at one time. Here are the performance I get from 128M spinand with NFTL by sequential writing. \| file size \| buffer size \| write speed \| \| 10 MB \| 0 B \| 3206.01 KB/s \| \| 10 MB \| 1 B \| 2434.04 KB/s \| \| 10 MB \| 2 B \| 2685.78 KB/s \| \| 10 MB \| 4 B \| 2857.94 KB/s \| \| 10 MB \| 8 B \| 3060.68 KB/s \| \| 10 MB \| 16 B \| 3155.30 KB/s \| \| 10 MB \| 64 B \| 3193.68 KB/s \| \| 10 MB \| 128 B \| 3230.62 KB/s \| \| 10 MB \| 256 B \| 3153.03 KB/s \| \| 70 MB \| 0 B \| 2258.87 KB/s \| \| 70 MB \| 1 B \| 1827.83 KB/s \| \| 70 MB \| 2 B \| 1962.29 KB/s \| \| 70 MB \| 4 B \| 2074.01 KB/s \| \| 70 MB \| 8 B \| 2147.03 KB/s \| \| 70 MB \| 64 B \| 2179.92 KB/s \| \| 70 MB \| 256 B \| 2179.96 KB/s \| The 0 Byte size means no validation and the 1 Byte size is how littlefs do before. Based on the above table and to save memory, comparing 8 bytes at one time is more wonderful. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>	2020-03-13 15:23:20 +08:00
Chris Desjardins	cb26157880	Change assert to runtime check. I had a system that was constantly hitting this assert, after making this change it recovered immediately.	2020-02-23 22:18:08 -06:00
Christopher Haster	d04b077506	Fixed minor things to get CI passing again - Added caching to Travis install dirs, because otherwise pip3 install fails randomly - Increased size of littlefs-fuse disk because test script has a larger footprint now - Skip a couple of reentrant tests under byte-level writes because the tests just take too long and cause Travis to bail due to no output for 10m - Fixed various Valgrind errors - Suppressed uninit checks for tests where LFS_BLOCK_ERASE_VALUE == -1. In this case rambd goes uninitialized, which is fine for rambd's purposes. Note I couldn't figure out how to limit this suppression to only the malloc in rambd, this doesn't seem possible with Valgrind. - Fixed memory leaks in exhaustion tests - Fixed off-by-1 string null-terminator issue in paths tests - Fixed lfs_file_sync issue caused by revealed by fixing memory leaks in exhaustion tests. Getting ENOSPC during a file write puts the file in a bad state where littlefs doesn't know how to write it out safely. In this case, lfs_file_sync and lfs_file_close return 0 without writing out state so that device-side resources can still be cleaned up. To recover from ENOSPC, the file needs to be reopened and the writes recreated. Not sure if there is a better way to handle this. - Added some quality-of-life improvements to Valgrind testing - Fit Valgrind messages into truncated output when not in verbose mode - Turned on origin tracking	2020-02-18 18:05:03 -06:00
Christopher Haster	dcae185a00	Fixed typo in LFS_MKTAG_IF_ELSE	2020-02-12 11:31:34 -06:00
Christopher Haster	b69cf890e6	Fixed CRC check when prog_size causes multiple CRCs per commit This is a bit of a strange case that can be caused by storage with very large prog sizes, such as NAND flash. We only have 10 bits to store the size of our padding, so when the prog_size gets larger than 1024 bytes, we have to use multiple padding tags to commit to the next prog_size boundary. This causes some complication for the new logic that checks CRCs in case our block becomes "readonly" and contains existing commits that just happen to match our new commit size. Here we just check the CRC of the first commit. This isn't perfect but does protect against pure "readonly" blocks.	2020-02-09 22:43:20 -06:00
Christopher Haster	02c84ac5f4	Cleaned up dependent fixes on branch These should probably have been cleaned up in each commit to allow cherry-picking, but due to time I haven't been able to. - Went with creating an mdir copy in lfs_dir_commit. This handles a number of related cleanup issues in lfs_dir_compact and it does so more robustly. As a plus we can use the copy to update dependencies in the mlist. - Eliminated code left by the ENOSPC file outlining - Cleaned up TODOs and lingering comments - Changed the reentrant many directory create/rename/remove test to use a smaller set of directories because of space issues when READ/PROG_SIZE=512	2020-02-09 12:37:39 -06:00
Christopher Haster	6530cb3a61	Fixed lfs_fs_size doubling metadata-pairs This was caused by the previous fix for allocations during lfs_fs_deorphan in this branch. To catch half-orphans during block allocations we needed to duplicate all metadata-pairs reported to lfs_fs_traverse. Unfortunately this causes lfs_fs_size to report 2x the number of metadata-pairs, which would undoubtably confuse users. The fix here is inelegantly simple, just do a different traversale for allocations and size measurements. It reuses the same code but touches slightly different sets of blocks. Unfortunately, this causes the public lfs_fs_traverse and lfs_fs_size functions to split in how they report blocks. This is technically allowed, since lfs_fs_traverse may report blocks multiple times due to CoW behavior, however it's undesirable and I'm sure there will be some confusion. But I don't have a better solution, so from this point lfs_fs_traverse will be reporting 2x metadata-blocks and shouldn't be used for finding the number of available blocks on the filesystem.	2020-02-09 12:00:23 -06:00
Christopher Haster	fe957de892	Fixed broken wear-leveling when block_cycles = 2n-1 This was an interesting issue found during a GitHub discussion with rmollway and thrasher8390. Blocks in the metadata-pair are relocated every "block_cycles", or, more mathy, when rev % block_cycles == 0 as long as rev += 1 every block write. But there's a problem, rev isn't += 1 every block write. There are two blocks in a metadata-pair, so looking at it from each blocks perspective, rev += 2 every block write. This leads to a sort of aliasing issue, where, if block_cycles is divisible by 2, one block in the metadata-pair is always relocated, and the other block is _never_ relocated. Causing a complete failure of block-level wear-leveling. Fortunately, because of a previous workaround to avoid block_cycles = 1 (since this will cause the relocation algorithm to never terminate), the actual math is rev % (block_cycles+1) == 0. This means the bug only shows its head in the much less likely case where block_cycles is a multiple of 2 plus 1, or, in more mathy terms, block_cycles = 2n+1 for some n. To workaround this we can bitwise or our block_cycles with 1 to force it to never be a multiple of 2n. (Maybe we should do this during initialization? But then block_cycles would need to be mutable.) --- There's a few unrelated changes mixed into this commit that shouldn't be there since I added this as part of a branch of bug fixes I'm putting together rather hastily, so unfortunately this is not easily cherry-pickable.	2020-02-09 12:00:23 -06:00
Christopher Haster	f9c2fd93f2	Removed file outlining on ENOSPC in lfs_file_sync This was initially added as protection against the case where a file grew to no longer fit in a metadata-pair. While in most cases this should be caught by the math in lfs_file_write, it doesn't handle a problem that can happen if the files metadata is large enough that even small inline files can't fit. This can happen if you combine a small block size with large file names and many custom attributes. But trying to outline on ENOSPC creates creates a lot of problems. If we are actually low on space, this is one of the worst things we can do. Inline files take up less space than CTZ skip-lists, but inline files are rendered useless if we outline inline files as soon as we run low on space. On top of this, the outlining logic tries multiple mdir commits if it gets ENOSPC, which can hide errors if ENOSPC is returned for other reasons. In a perfect world, we would be using a different error code for no-room-in-metadata-pair, and no-blocks-on-disk. For now I've removed the outlining logic and we will need to figure out how to handle this situation more robustly.	2020-02-09 12:00:23 -06:00
Christopher Haster	77e3078b9f	Added/fixed tests for noop writes (where bd error can't be trusted) It's interesting how many ways block devices can show failed writes: 1. prog can error 2. erase can error 3. read can error after writing (ECC failure) 4. prog doesn't error but doesn't write the data correctly 5. erase doesn't error but doesn't erase correctly Can read fail without an error? Yes, though this appears the same as prog and erase failing. These weren't all simulated by testbd since I unintentionally assumed the block device could always error. Fixed by added additional bad-black behaviors to testbd. Note: This also includes a small fix where we can miss bad writes if the underlying block device contains a valid commit with the exact same size in the exact same offset.	2020-02-09 12:00:22 -06:00
Christopher Haster	517d3414c5	Fixed more bugs, mostly related to ENOSPC on different geometries Fixes: - Fixed reproducability issue when we can't read a directory revision - Fixed incorrect erase assumption if lfs_dir_fetch exceeds block size - Fixed cleanup issue caused by lfs_fs_relocate failing when trying to outline a file in lfs_file_sync - Fixed cleanup issue if we run out of space while extending a CTZ skip-list - Fixed missing half-orphans when allocating blocks during lfs_fs_deorphan Also: - Added cycle-detection to readtree.py - Allowed pseudo-C expressions in test conditions (and it's beautifully hacky, see line 187 of test.py) - Better handling of ctrl-C during test runs - Added build-only mode to test.py - Limited stdout of test failures to 5 lines unless in verbose mode Explanation of fixes below 1. Fixed reproducability issue when we can't read a directory revision An interesting subtlety of the block-device layer is that the block-device is allowed to return LFS_ERR_CORRUPT on reads to untouched blocks. This can easily happen if a user is using ECC or some sort of CMAC on their blocks. Normally we never run into this, except for the optimization around directory revisions where we use uninitialized data to start our revision count. We correctly handle this case by ignoring whats on disk if the read fails, but end up using unitialized RAM instead. This is not an issue for normal use, though it can lead to a small information leak. However it creates a big problem for reproducability, which is very helpful for debugging. I ended up running into a case where the RAM values for the revision count was different, causing two identical runs to wear-level at different times, leading to one version running out of space before a bug occured because it expanded the superblock early. 2. Fixed incorrect erase assumption if lfs_dir_fetch exceeds block size This could be caused if the previous tag was a valid commit and we lost power causing a partially written tag as the start of a new commit. Fortunately we already have a separate condition for exceeding the block size, so we can force that case to always treat the mdir as unerased. 3. Fixed cleanup issue caused by lfs_fs_relocate failing when trying to outline a file in lfs_file_sync Most operations involving metadata-pairs treat the mdir struct as entirely temporary and throw it out if any error occurs. Except for lfs_file_sync since the mdir is also a part of the file struct. This is relevant because of a cleanup issue in lfs_dir_compact that usually doesn't have side-effects. The issue is that lfs_fs_relocate can fail. It needs to allocate new blocks to relocate to, and as the disk reaches its end of life, it can fail with ENOSPC quite often. If lfs_fs_relocate fails, the containing lfs_dir_compact would return immediately without restoring the previous state of the mdir. If a new commit comes in on the same mdir, the old state left there could corrupt the filesystem. It's interesting to note this is forced to happen in lfs_file_sync, since it always tries to outline the file if it gets ENOSPC (ENOSPC can mean both no blocks to allocate and that the mdir is full). I'm not actually sure this bit of code is necessary anymore, we may be able to remove it. 4. Fixed cleanup issue if we run out of space while extending a CTZ skip-list The actually CTZ skip-list logic itself hasn't been touched in more than a year at this point, so I was surprised to find a bug here. But it turns out the CTZ skip-list could be put in an invalid state if we run out of space while trying to extend the skip-list. This only becomes a problem if we keep the file open, clean up some space elsewhere, and then continue to write to the open file without modifying it. Fortunately an easy fix. 5. Fixed missing half-orphans when allocating blocks during lfs_fs_deorphan This was a really interesting bug. Normally, we don't have to worry about allocations, since we force consistency before we are allowed to allocate blocks. But what about the deorphan operation itself? Don't we need to allocate blocks if we relocate while deorphaning? It turns out the deorphan operation can lead to allocating blocks while there's still orphans and half-orphans on the threaded linked-list. Orphans aren't an issue, but half-orphans may contain references to blocks in the outdated half, which doesn't get scanned during the normal allocation pass. Fortunately we already fetch directory entries to check CTZ lists, so we can also check half-orphans here. However this causes lfs_fs_traverse to duplicate all metadata-pairs, not sure what to do about this yet.	2020-02-09 11:54:22 -06:00
Christopher Haster	a5d614fbfb	Added tests for power-cycled-relocations and fixed the bugs that fell out The power-cycled-relocation test with random renames has been the most aggressive test applied to littlefs so far, with: - Random nested directory creation - Random nested directory removal - Random nested directory renames (this could make the threaded linked-list very interesting) - Relocating blocks every write (maximum wear-leveling) - Incrementally cycling power every write Also added a couple other tests to test_orphans and test_relocations. The good news is the added testing worked well, it found quite a number of complex and subtle bugs that have been difficult to find. 1. It's actually possible for our parent to be relocated and go out of sync in lfs_mkdir. This can happen if our predecessor's predecessor is our parent as we are threading ourselves into the filesystem's threaded list. (note this doesn't happen if our predecessor _is_ our parent, as we then update our parent in a single commit). This is annoying because it only happens if our parent is a long (>1 pair) directory, otherwise we wouldn't need to catch relocations. Fortunately we can reuse the internal open file/dir linked-list to catch relocations easily, as long as we're careful to unhook our parent whenever lfs_mkdir returns. 2. Even more surprising, it's possible for the child in lfs_remove to be relocated while we delete the entry from our parent. This can happen if we are our own parent's predecessor, since we need to be updated then if our parent relocates. Fortunately we can also hook into the open linked-list here. Note this same issue was present in lfs_rename. Fortunately, this means now all fetched dirs are hooked into the open linked-list if they are needed across a commit. This means we shouldn't need assumptions about tree movement for correctness. 3. lfs_rename("deja/vu", "deja/vu") with the same source and destination was broken and tried to delete the entry twice. 4. Managing gstate deltas when we lose power during relocations was broken. And unfortunately complicated. The issue happens when we lose power during a relocation while removing a directory. When we remove a directory, we need to move the contents of its gstate delta to another directory or we'll corrupt littlefs gstate. (gstate is an xor of all deltas on the filesystem). We used to just xor the gstate into our parent's gstate, however this isn't correct. The gstate isn't built out of the directory tree, but rather out of the threaded linked-list (which exists to make collecting this gstate efficient). Because we have to remove our dir in two operations, there's a point were both the updated parent and child can exist in threaded linked-list and duplicate the child's gstate delta. .--------. ->\| parent \|-. \| gstate \| \| .-\| a \|-' \| '--------' \| X <- child is orphaned \| .--------. '>\| child \|-> \| gstate \| \| a \| '--------' What we need to do is save our child's gstate and only give it to our predecessor, since this finalizes the removal of the child. However we still need to make valid updates to the gstate to mark that we've created an orphan when we start removing the child. This led to a small rework of how the gstate is handled. Now we have a separation of the gpending state that should be written out ASAP and the gdelta state that is collected from orphans awaiting deletion. 5. lfs_deorphan wasn't actually able to handle deorphaning/desyncing more than one orphan after a power-cycle. Having more than one orphan is very rare, but of course very possible. Fortunately this was just a mistake with using a break the in the deorphan, perhaps left from v1 where multiple orphans weren't possible? Note that we use a continue to force a refetch of the orphaned block. This is needed in the case of a half-orphan, since the fetched half-orphan may have an outdated tail pointer.	2020-01-26 23:45:54 -06:00

1 2 3 4 5 ...

352 Commits