Commit Graph

247 Commits

Christopher Haster
4db96d4d44 Changed unwritable superblock to ENOSPC for consistency
While ECORRUPT is not a wrong error code, it doesn't match other
instances of hitting a corrupt block during write. During writes, if
blocks are detected as corrupt, their data is evicted and moved to a new
clean block. This means that at the end of a disk's lifetime, exhaustion
errors will be reported as ENOSPC when littlefs can't find any new block
to store the data.

This has the benefit of matching behaviour when a new file is written
and no more blocks can be found, due to either a small disk or corrupted
blocks on disk. To littlefs it's like the disk shrinks in size over
time.
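
A minimal sketch of what this looks like from the caller's side,
assuming the standard lfs_file_write API and error codes:

#include "lfs.h"

// at the end of a disk's lifetime, a failed write now reports
// LFS_ERR_NOSPC instead of LFS_ERR_CORRUPT
int append_record(lfs_t *lfs, lfs_file_t *file,
        const void *buf, lfs_size_t size) {
    lfs_ssize_t res = lfs_file_write(lfs, file, buf, size);
    if (res == LFS_ERR_NOSPC) {
        // exhaustion: no new clean block could be found, either the
        // disk is full or it has worn out
        return res;
    }
    return (res < 0) ? (int)res : 0;
}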
2018-10-18 10:00:48 -05:00
Christopher Haster
a2532a34cd Fixed inline files when inline_max == cache_size
The initial implementation of inline files was thrown together fairly
quickly, but it has worked well so far and there hasn't been much
reason to change it.

One shortcut was to trick file writes into thinking they are writing to
imaginary blocks. This works well and reuses most of the file code
paths, as long as we don't flush the imaginary block out to disk.

Initially we did this by limiting inline_max to cache_max-1, ensuring
that the cache never fills up and gets flushed. This was a rather dirty
hack; the better solution, implemented here, is to handle the
representation of an "imaginary" block correctly all the way down into
the cache layer.

So now for files specifically, the value -1 represents a null pointer,
and the value -2 represents an "imaginary" block. This may become a
problem if the number of blocks approaches the max, however this -2
value is never written to disk and can be changed in the future without
breaking compatibility.
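
A sketch of this sentinel encoding; the names here are illustrative,
only the -1/-2 values come from this change:

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t lfs_block_t;

#define BLOCK_NULL   ((lfs_block_t)-1) // null pointer, no block at all
#define BLOCK_INLINE ((lfs_block_t)-2) // "imaginary" block behind an inline file

// the cache layer must never flush an imaginary block to disk
static bool cache_flushable(lfs_block_t block) {
    return block != BLOCK_NULL && block != BLOCK_INLINE;
}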
2018-10-18 10:00:48 -05:00
Christopher Haster
d5e800575d Collapsed recursive deorphans into a single pass
Because a block can go bad at any time, if we're unlucky, we may end up
generating multiple orphans in a single metadata write. This is
exacerbated by the early eviction in dynamic wear-leveling.

We can't track _all_ orphans, because that would require unbounded
storage and significantly complicate things, but there are a handful of
intentional orphans we do track because they are easy to resolve without
the O(n^2) deorphan scan. These are anytime we intentionally remove a
metadata-pair.

Initially we cleaned up orphans as they occurred with whatever
knowledge we did have, and just accepted the extra O(n^2) deorphan
scans in the unlucky case. However, we can do a bit better by being
lazy and leaving deorphaning up to the next metadata write. This needs
to work with the known
orphans while still setting the orphan flag on disk correctly. To
accomplish this we replace the internal flag with a small counter.

Note, this means that our internal representation of orphans differs
from what's on disk. This is annoying but not the end of the world.
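
A hypothetical sketch of the counter scheme (names invented for
illustration):

#include <stdbool.h>
#include <stdint.h>

// known, intentional orphans are counted rather than flagged; the
// on-disk orphan flag is derived from whether the count is nonzero
typedef struct {
    uint8_t orphans;
} orphan_state_t;

static void orphan_created(orphan_state_t *s)  { s->orphans += 1; }
static void orphan_resolved(orphan_state_t *s) { s->orphans -= 1; }

// what the next metadata write stores as the on-disk orphan flag
static bool orphan_flag(const orphan_state_t *s) {
    return s->orphans > 0;
}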
2018-10-18 10:00:48 -05:00
Christopher Haster
21217d75ad Dropped lfs_fs_getattr for the more implicit lfs_getattr("/")
This was a pretty simple oversight on my part. Conceptually, there's no
difference between lfs_fs_getattr and lfs_getattr("/"). Any operations
on directories can be applied "globally" by referring to the root
directory.

Implementation-wise, this actually fixes the "corner case" of storing
attributes on the root directory, which was broken since the root
directory doesn't have a related entry. Instead we need to use the root
superblock for this purpose.

Fewer functions means less code to document and maintain, so this is a
nice benefit. Now we just have a single lfs_getattr/setattr/removeattr set
of functions along with the ability to access attributes atomically in
lfs_file_opencfg.
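
For example, a filesystem-level attribute can now be read through the
root path (the attribute type 'L' is an arbitrary example value):

#include "lfs.h"

// lfs_getattr("/") fills the role of the dropped lfs_fs_getattr
int read_fs_label(lfs_t *lfs, char *label, lfs_size_t size) {
    lfs_ssize_t res = lfs_getattr(lfs, "/", 'L', label, size);
    return (res < 0) ? (int)res : 0;
}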
2018-10-18 10:00:48 -05:00
Christopher Haster
38011f4cd0 Fixed minor memory leak
- Fixed memory leak
- Changed lfs_globals_zero to use memset, as this
  made leak checking more effective
- Checked for leaks with valgrind
2018-10-18 10:00:48 -05:00
Christopher Haster
126ef8b07f Added allocation randomization for dynamic wear-leveling
This implements the second step of full dynamic wear-leveling, block
allocation randomization. This is the key part that uniformly
distributes wear across the filesystem, even through reboots.

The entropy actually comes from the filesystem itself, by xoring
together all of the CRCs in the metadata-pairs on the filesystem. While
this sounds like a ridiculous operation, it's easy to do when we already
scan the metadata-pairs at mount time.

This gives us a random number we can use for block allocation.
Unfortunately it's not a great general purpose random generator as the
output only changes every filesystem write. Fortunately that's exactly
when we need our allocator.

---

Additionally, the randomization created a mess for the testing
framework. Fortunately, this method of randomization is deterministic,
a very useful property for reproducing bugs.
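
A sketch of the seeding, assuming we fold in every metadata-pair CRC
visited during the mount scan:

#include <stdint.h>

// xor together the CRCs seen while scanning metadata-pairs at mount;
// xor is cheap, order-independent, and changes with every write
static uint32_t seed = 0;

static void seed_crc(uint32_t crc) {
    seed ^= crc;
}

// where the block allocator starts its search after mount
static uint32_t alloc_start(uint32_t block_count) {
    return seed % block_count;
}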
2018-10-18 09:55:47 -05:00
Christopher Haster
e4a0d586d5 Added building blocks for dynamic wear-leveling
Initially, littlefs relied entirely on bad-block detection for
wear-leveling. Conceptually, at the end of a device's lifespan, all
blocks would be worn evenly, even if they weren't worn out at the same
time. However, this doesn't work for all devices: rather than causing
corruption during writes, wear reduces a device's "sticking power",
causing bits to flip over time. This means that for many devices, true
wear-leveling (dynamic or static) is required.

Fortunately, way back at the beginning, littlefs was designed to do full
dynamic wear-leveling, only dropping it when making the retrospectively
short-sighted realization that bad-block detection is theoretically
sufficient. We can enable dynamic wear-leveling with only a few tweaks
to littlefs. These can be implemented without breaking backwards
compatibility.

1. Evict metadata-pairs after a certain number of writes. Eviction in
   this case is identical to a relocation to recover from a bad block.
   We move our data and stick the old block back into our pool of
   blocks.

   For knowing when to evict, we already have a revision count for each
   metadata-pair which gives us enough information. We add the
   configuration option block_cycles and evict when our revision count
   is a multiple of this value (see the sketch after this list).

2. Now all blocks participate in COW behaviour. However we don't store
   the state of our allocator, so every boot cycle we reuse the first
   blocks on storage. This is very bad on a microcontroller, where we
   may reboot often. We need a way to spread our usage across the disk.

   To pull this off, we can simply randomize which block we start our
   allocator at. But we need a random number generator that is different
   on each boot. Fortunately we have a great source of entropy, our
   filesystem. So we seed our block allocator with a simple hash of the
   CRCs on our metadata-pairs. This can be done for free since we
   already need to scan the metadata-pairs during mount.
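
A sketch of the eviction check from step 1; the exact condition is an
assumption, the commit only specifies "a multiple of block_cycles":

#include <stdbool.h>
#include <stdint.h>

// evict (relocate) a metadata-pair once its revision count says we've
// rewritten it block_cycles times
static bool should_evict(uint32_t rev, uint32_t block_cycles) {
    return block_cycles > 0 && rev % block_cycles == 0;
}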

What we end up with is a uniform distribution of wear on storage. The
wear is not perfect: if a block is used for metadata it gets more wear,
and the randomization may not be exact. But we can never actually get
perfect wear-leveling, since we're already resigned to dynamic
wear-leveling at the file level.

With the addition of metadata logging, we end up with a really
interesting two-stage wear-leveling algorithm. At the low-level,
metadata is statically wear-leveled. At the high-level, blocks are
dynamically wear-leveled.

---

This specific commit implements the first step, eviction of metadata
pairs. Intertwining this into the already complicated compact logic was
a bit annoying, however we can combine the logic for superblock
expansion with the logic for metadata-pair eviction.
2018-10-18 09:30:45 -05:00
Christopher Haster
20b669a23d Fixed issue with big-endian CTZ lists intertwined in commit logic
Found while testing big-endian support. Basically, if littlefs is really
really unlucky, the block allocator could kick in while committing a
file's CTZ reference. If this happens, the block allocator will need to
traverse all CTZ skip-lists in memory, including the skip-list we're
committing. This means we can't convert the CTZ's endianness in place,
and need to make a copy on big-endian systems.

We rely on dead-code elimination from the compiler to make the
conditional behaviour for big-endian vs little-endian systems a noop
determined by the lfs_tole32 intrinsic.
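
A simplified standalone sketch of how such an intrinsic becomes a noop
on little-endian targets (the real lfs_tole32 lives in lfs_util.h):

#include <stdint.h>

// byte-swap on big-endian targets, identity on little-endian ones;
// dead-code elimination removes the little-endian branch entirely
static inline uint32_t tole32(uint32_t a) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return __builtin_bswap32(a);
#else
    return a;
#endif
}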
2018-10-16 20:53:25 -05:00
Christopher Haster
10f45ac02f Changed lfs_crc to match more common API
In looking at the common CRC APIs out there, this seemed the most
common. At least more common than the current modified-in-place pointer
API. It also seems to have a slightly better code footprint. I'm blaming
pointer optimization issues.

One downside is that lfs_crc can't report errors; however, it was
already assumed that lfs_crc cannot error.
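
The adopted shape, with a minimal bitwise crc32 standing in for the
real table-driven implementation:

#include <stdint.h>
#include <stddef.h>

// take the running crc by value and return the updated value, rather
// than modifying it through a pointer
uint32_t crc32_update(uint32_t crc, const void *buffer, size_t size) {
    const uint8_t *data = buffer;
    for (size_t i = 0; i < size; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++) {
            crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
        }
    }
    return crc;
}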
2018-10-16 20:53:19 -05:00
Christopher Haster
3b3981eb74 Fixed testing issues introduced by expanding superblocks
This was mostly tweaking test cases to accommodate variable-sized
superblock-lists. Though there were a few bugs that needed fixing:
- Changed compact to use source dir for move since the original dir
  could have changed as a result of an expand.
- Created copy of current directory so we don't overwrite ourselves
  during an internal commit update.

Also made sure all of the test suites provide reproducible results
when run independently (the entry tests were behaving differently based
on which tests were run before).

(Some were legitimate test failures.)
2018-10-16 20:18:24 -05:00
Christopher Haster
7c70068b89 Added root entry and expanding superblocks
Expanding superblocks has been on my wishlist for a while. The basic
idea is that instead of maintaining a fixed pointer from blocks {0, 1}
to the root directory, we maintain a dynamically sized linked-list of
superblocks that point to the actual root. If the number of writes to
the root exceeds some value, we increase the size of the superblock
linked-list.

This can leverage existing metadata-pair operations. The revision count for
metadata-pairs provides some knowledge on how much wear we've put on the
superblock, and the threaded linked-list can also be reused for this
purpose. This means superblock expansion is both optional and cheap to
implement.

Expanding superblocks helps both extremely small and extremely large
filesystems (extreme being relative of course). On the small end, we
can actually collapse the superblock into the root directory and drop
the hard requirement of 4 blocks for the superblock. On the large end,
our superblock will
now last longer than the rest of the filesystem. Each time we expand,
the number of cycles until the superblock dies is increased by a power.

Before we were stuck with this layout:
level  cycles  limit    layout
1      E^2     390 MiB  s0 -> root

Now we expand every time a fixed offset is exceeded:
level  cycles  limit    layout
0      E       4 KiB    s0+root
1      E^2     390 MiB  s0 -> root
2      E^3     37 TiB   s0 -> s1 -> root
3      E^4     3.6 EiB  s0 -> s1 -> s2 -> root
...

Where the cycles are the number of cycles before death, and the limit
is the worst-case size of a filesystem where early superblock death
becomes a concern (all writes to root, using the formula E^|s| = E*B,
where E = erase cycles = 100000 and B = block count, assuming 4096-byte
blocks).
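
For example, at level 1 the formula gives E^2 = E*B, so B = E = 100000
blocks; at 4096 bytes per block that's 100000 * 4096 bytes, roughly
390 MiB, matching the table above.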

Note we can also store copies of the superblock entry on the expanded
superblocks. This may help filesystem recovery tools in the future.
2018-10-16 19:30:56 -05:00
Christopher Haster
c3e36bd2a7 Standardized naming for internal functions
- lfs_pairblah -> lfs_pair_blah
- lfs_ctzblah -> lfs_ctz_blah
- lfs_tagblah -> lfs_tag_blah
- lfs_globalblah -> lfs_global_blah
- lfs_commitblah -> lfs_commit_blah
2018-10-16 11:35:39 -05:00
Christopher Haster
6d0a6fc462 Merge remote-tracking branch 'origin/master' into v2-alpha 2018-10-16 11:33:00 -05:00
Christopher Haster
dbcbe4e088 Changed name of upper-limits from blah_size to blah_max
This standardizes the naming between the LFS_BLAH_MAX macros and the
blah_max configuration in the lfs_config structure.
2018-10-16 09:42:46 -05:00
Christopher Haster
a88230ae6a Updated custom attribute documentation and tweaked nonexistent attributes
Because of limitations in how littlefs manages attributes on disk,
littlefs views zero-length attributes and missing attributes as the same
thing. The simplest implementation of attributes mirrors this behaviour
transparently for the user.
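
In other words, these two calls produce on-disk states littlefs cannot
tell apart (the path and attribute type are example values):

#include <stdbool.h>
#include "lfs.h"

// removing an attribute and truncating it to zero length are the same
// operation as far as the on-disk format is concerned
int clear_attr(lfs_t *lfs, bool by_remove) {
    return by_remove
            ? lfs_removeattr(lfs, "file", 'v')
            : lfs_setattr(lfs, "file", 'v', NULL, 0);
}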
2018-10-16 09:20:44 -05:00
Christopher Haster
f369f80540 Added tests for global state stealing
State stealing is a tricky part of managing the xored-globals. When
removing a metadata-pair from the metadata chain, whichever
metadata-pair does the removing is also responsible for stealing the
removed metadata-pair's global delta and incorporating it into its own
global delta. Otherwise the global state would become corrupted.
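
A sketch of the stealing operation (the struct shape is hypothetical):

#include <stdint.h>

// xored-globals deltas combine by xor, so the removing metadata-pair
// can absorb the removed pair's delta without losing information
typedef struct {
    uint32_t words[3]; // hypothetical size of the global state
} gdelta_t;

static void steal(gdelta_t *ours, const gdelta_t *removed) {
    for (int i = 0; i < 3; i++) {
        ours->words[i] ^= removed->words[i];
    }
}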
2018-10-16 09:18:18 -05:00
Christopher Haster
1941bbda76 Cleaned up config options
- Updated documentation where needed
- Added asserts which take into account relationships with the new
  cache_size configuration
- Restructured ordering to be consistent for the three main
  configurables: LFS_ATTR_MAX, LFS_NAME_MAX, and LFS_INLINE_MAX
2018-10-16 09:07:22 -05:00
Christopher Haster
3cfa08602a Introduced cache_size as alternative to hardware read/write sizes
The introduction of an explicit cache_size configuration allows
customization of the cache buffers independently from the hardware
read/write sizes.

This has been one of littlefs's main handicaps. Without a distinction
between cache units and hardware limitations, littlefs isn't able to
read or program _less_ than the cache size. This leads to the
counter-intuitive case where larger cache sizes can actually be harmful,
since larger read/prog sizes require sending more data over the bus if
we're only accessing a small set of data (for example the CTZ skip-list
traversal).

This is compounded with metadata logging, since a large program size
limits the number of commits we can write out in a single metadata
block. It really doesn't make sense to link program size + cache
size here.

With a separate cache_size configuration, we can be much smarter about
what we actually read/write from disk.

This also simplifies cache handling a bit. Before there were two
possible cache sizes, but these were rarely used. Note that the
cache_size is NOT written to the superblock and can be freely changed
without breaking backwards compatibility.
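
A sketch of a configuration exercising the new option, assuming the v2
field names (block device callbacks omitted):

#include "lfs.h"

// small hardware granularity, but a larger cache to batch bus traffic
static const struct lfs_config cfg = {
    // .read, .prog, .erase, .sync callbacks omitted in this sketch
    .read_size = 16,    // hardware minimum read
    .prog_size = 16,    // hardware minimum program
    .block_size = 4096,
    .block_count = 1024,
    .cache_size = 512,  // independent of read/prog size, not stored
                        // in the superblock
};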
2018-10-16 08:32:01 -05:00
Christopher Haster
97f35c3e05 Simplified the internal xored-globals implementation
There wasn't much use (and inconsistent compiler support) for storing
small values next to the unaligned lfs_global_t struct. So instead, I've
rounded the struct up to the nearest word to try to take advantage of
the alignment in xor and memset operations.

I've also moved the global fetching into lfs_mount, since that was the
only use of the operation. This allows for some variable reuse in the
mount function.
2018-10-16 08:28:14 -05:00
Christopher Haster
35f68d28cc Squished in-flight files/dirs into single list
This is an effort to try to consolidate the handling of in-flight files
and dirs opened by the user (and possibly opened internally). Both files
and dirs have metadata state that need to be kept in sync by the commit
logic.

This metadata state is mostly contained in the lfs_mdir_t type, which is
present in both the lfs_file_t and lfs_dir_t. Unfortunately both of
these structs have some relatively unrelated metadata that needs to be
kept in sync:
- Files store an id representing the open file
- Dirs store an id during iteration

While these take up the same space, they unfortunately need to be
managed differently by the commit logic.

The best solution I can come up with is to simply store a general
purpose list and tag both structures with LFS_TYPE_REG and LFS_TYPE_DIR
respectively. This is kinda funky, but wins out over duplicating the
commit logic.
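
A hypothetical sketch of the consolidated list (field names invented,
lfs_mdir_t and the type constants from lfs.h):

#include "lfs.h"

// every in-flight file and dir lives on one list; the type tag tells
// the commit logic how to interpret the id field
struct mlist {
    struct mlist *next;
    uint16_t id;  // files: the open file's id; dirs: iteration position
    uint8_t type; // LFS_TYPE_REG or LFS_TYPE_DIR
    lfs_mdir_t m; // metadata state kept in sync by the commit logic
};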
2018-10-16 08:18:21 -05:00
Christopher Haster
bd1e0c4059 Cleaned up several TODOs
Other than removing outdated TODOs, there are several tweaks:
- Standardized naming of fs-level functions (mostly internal names)
- Tweaked low-level use of subtype to hopefully take advantage of
  redundant code removal
- Moved root-handling into lfs_dir_getinfo
- Updated DEBUG statements around move/orphan fixes
- Removed trailing 1s in type fields
- Removed unused code
2018-10-16 08:12:27 -05:00
Christopher Haster
01d837e08d Removed redundant lfs_scan in lfs_init
Interestingly enough, lfs_scan is only needed during mount, as the state
of the filesystem is well known during format.
2018-10-16 08:05:17 -05:00
Christopher Haster
112fefc068 Added back big-endian support again on the new metadata structures
The only interesting thing to note is that we now have to also support
le16 due to storing the id outside of tags in the globals structure.
2018-10-16 08:03:30 -05:00
Christopher Haster
64df0a5e20 Added orphan bit to xored-globals
Unfortunately for us, even with the new ability to store global state,
orphans can not be handled as gracefully as moves. This is due to the
fact that directory operations can create an unbounded number of
orphans. The number is usually small, but the fact that it's unbounded
means we can't store the orphan info in xored-globals.

However, one thing we can do to leverage the xored-global state is store
a bit indicating if _any_ orphans are present. This means in the common
case we can completely avoid the deorphan step, while only using a
single bit of the global state, which is effectively free since we can
store it in the globals tag itself.

If a littlefs drive does not want to consider the orphan bit, it's free
to use the previous behaviour of always checking for orphans on first
write.
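
A sketch of the common-case win (names hypothetical):

#include <stdbool.h>

// one bit of global state answers "any orphans at all?"
static bool orphan_bit;

static void deorphan(void) {
    // ... the O(n^2) scan, only needed while the bit is set ...
    orphan_bit = false;
}

static void before_first_write(void) {
    if (orphan_bit) {
        deorphan(); // common case: bit clear, scan skipped entirely
    }
}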
2018-10-16 07:48:15 -05:00
Christopher Haster
1a58ba799c Fixed ENOSPC issues with zero-granularity blocks
Result of testing on zero-granularity blocks, where the prog size and
read size equal the block size. This represents SD cards and other
traditional forms of block storage where we don't really get a benefit
from the metadata logging.

Unfortunately, since updates in both are tested by the same script,
we can't really use simple bash commands. Added a more complex
script to simulate corruption. Fortunately this should be more robust
than the previous solutions.

The main fixes were around corner cases where the commit logic fell
apart when it didn't have room to complete commits, but these were
fixable in the current design.
2018-10-16 07:41:56 -05:00
Christopher Haster
105907ba66 Cleaned up config usage in file logic
The main change here was to drop the in-place twiddling of custom
attributes to match the internal attribute structures. The original
thought was that this could allow the compiler to garbage collect more
of the custom attribute logic when not used, but since this occurs in
the common lfs_file_opencfg function, gc can't really happen.

Not twiddling the user's structure is the polite thing to do; it opens
up the ability to store the lfs_attr structure in ROM, and avoids
surprising the user if they attempt to use the structure for their own
purposes.

This means we can make the lfs_attr structure const and rely on the list
in the lfs_file_config structure, similar to how we rely on the global
lfs_config structure.

Some other tweaks:
- Dropped the global file_buffer, replaced entirely by per-file buffers.
- Updated LFS_INLINE_MAX and LFS_ATTR_MAX to correct values
- Added workaround for compiler bug related to zero initializer:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
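
A sketch of what this enables, assuming the v2 lfs_file_config and
lfs_attr shapes (the path "data" and type 'v' are example values):

#include "lfs.h"

// the attribute list is no longer twiddled in place, so it can be
// declared once up front
static uint8_t version_attr;
static struct lfs_attr attrs[] = {
    {.type = 'v', .buffer = &version_attr, .size = sizeof(version_attr)},
};

static const struct lfs_file_config fcfg = {
    .attrs = attrs,
    .attr_count = 1,
};

int open_with_attrs(lfs_t *lfs, lfs_file_t *file) {
    return lfs_file_opencfg(lfs, file, "data", LFS_O_RDWR, &fcfg);
}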
2018-10-16 07:31:24 -05:00
Christopher Haster
df1b607351 Removed the implicit lfs_t parameter to lfs_traverse
This is a very minor thing but it has been bugging me. On one hand, all
a callback ever needs is a single pointer for context. On the other
hand, you could make the argument that in the context of littlefs, the
lfs_t struct represents global state and should always be available to
callbacks passed to littlefs.

In the end I'm sticking with only a single context pointer, since this
satisfies the minimum requirements and has the highest chance of
function reuse. If a user needs access to the lfs_t struct, it can be
passed by reference in the context provided to the callback.

This also matches callbacks used in other languages with more emphasis
on objects and classes. Usually the callback doesn't get a reference to
the caller.
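
For example, a callback that does want the lfs_t can receive it
through the context pointer (signature as implied by this change):

#include "lfs.h"

// count the blocks in use; the callback needs nothing but its context
static int count_cb(void *ctx, lfs_block_t block) {
    (void)block;
    *(lfs_size_t *)ctx += 1;
    return 0;
}

int used_blocks(lfs_t *lfs, lfs_size_t *count) {
    *count = 0;
    return lfs_traverse(lfs, count_cb, count);
}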
2018-10-16 07:25:22 -05:00
Christopher Haster
225706044e Fixed test bugs around handling corruption
The main thing to consider was how lfs_dir_fetchwith reacts to
corruption it finds, and making sure that falling back to old values
works correctly.

Some of the tricky bits involved making sure we could fall back to both old
commits and old metadata blocks while still handling things like
synthetic moves correctly.
2018-10-16 07:15:59 -05:00
Christopher Haster
3e246da52c Fixed the orphan test to handle logging metadata-pairs
The main issue here was that the old orphan test relied on deleting the
block that contained the most recent update. In the new design this
doesn't really work since updates get appended to metadata-pairs
incrementally.

This is fixed by instead using the truncate command on the appropriate
block. We're now passing orphan tests.
2018-10-16 07:04:44 -05:00
Christopher Haster
15d156082c Added support for custom attributes leveraging the new metadata logging
Now that littlefs has been rebuilt almost from the ground up with the
intention to support custom attributes, adding in custom attribute
support is relatively easy.

The highest bit in the 9-bit type structure indicates that an attribute
is a user-specified custom attribute. The user then has a full 8 bits to
specify the attribute type. Other than that, custom attributes are
treated the same as system-level attributes.

Also made some tweaks to custom attributes:
- Adopted the opencfg for file-level attributes provided by dpgeorge
- Changed setattrs/getattrs to the simpler setattr/getattr functions
  users will probably be more familiar with. Note that multiple
  attributes can still be committed atomically with files, though not
  with directories.
- Changed LFS_ATTRS_MAX -> LFS_ATTR_MAX since there's no longer a global
  limit on the sum of attribute sizes, which was rather confusing.
  Though they are still limited by what can fit in a metadata-pair.
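
A usage sketch with the simplified functions (the attribute type 'v'
is an example value):

#include "lfs.h"

// set a custom attribute, then read it back; the user's 8-bit type
// lives under the custom-attribute bit in the 9-bit tag type
int tag_version(lfs_t *lfs, const char *path) {
    uint8_t version = 1;
    int err = lfs_setattr(lfs, path, 'v', &version, sizeof(version));
    if (err) {
        return err;
    }

    uint8_t readback = 0;
    lfs_ssize_t res = lfs_getattr(lfs, path, 'v',
            &readback, sizeof(readback));
    return (res < 0) ? (int)res : 0;
}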
2018-10-16 06:41:59 -05:00
Christopher Haster
3914cdf39f Pulled in fixes for additional path corner cases
Pulled in 015b86b. Merging this now avoids duplicate effort restructuring
the path lookup logic.
2018-10-16 06:36:00 -05:00
Christopher Haster
392b2ac79f Refactored the updates of in-flight files/dirs
Updated to account for changes as a result of commits/compacts. And
changed instances of iteration over both files and dirs to use a single
nested loop.

This does rely implicitly on the structure layout of dirs/files and
their location in lfs_t, which isn't great. But it gets the job done
with less code duplication.
2018-10-16 06:00:18 -05:00
Christopher Haster
d9a24d0a2b Fixed move handling when caught in a relocate
This was a surprisingly tricky issue. One of the subtle requirements for
the new move handling to work is that the block containing the move does
not change until the move is resolved. Initially, this seemed easy to
implement, given that a move is always immediately followed by its
resolution.

However, the extra metadata-pair operations needed to maintain integrity
present a challenge. At any commit, a directory block may end up moved
as a side effect of relocation due to a bad block.

The fix here is to move the move resolution directly into the commit
logic. This means that any commit to a block containing a move will be
implicitly resolved, leaving the later attempt at move resolution as a
noop.

This fix required quite a bit of restructuring, but as a nice
side-effect some of the complexity around moves actually went away.
Additionally, the new move handling is surprisingly powerful at
combining moves with nearby commits. And we now get same-metadata-pair
renames for free! A win for procrastination on that minor feature.
2018-10-16 05:28:00 -05:00
Christopher Haster
5d24e656f1 Cleaned up commit logic and function organization
Restructured function organization to make a bit more sense, and made
some small refactoring tweaks, specifically around the commit logic and
global related functions.
2018-10-16 05:19:38 -05:00
Christopher Haster
d3f3711560 Cleaned up attributes and related logic
The biggest change here is to make littlefs less obsessed with the
lfs_mattr_t struct. It was limiting our flexibility and could be entirely
replaced by passing the tag + data explicitly. The remaining use of
lfs_mattr_t is specific to the commit logic, where it replaces the
lfs_mattrlist_t struct.

Other changes:
- Added global lfs_diskoff struct for embedding disk references inside
  the lfs_mattr_t.
- Reordered lfs_mattrlist_t to squeeze out some code savings
- Added commit_get for explicit access to entries from unfinished
  metadata-pairs
- Parameterized the "stop_at_commit" flag instead of hackily storing it
  in the lfs_mdir_t temporarily
- Changed return value of lfs_pred to error-only with ENOENT representing
  a missing predecessor
- Adopted const where possible
2018-10-16 05:03:44 -05:00
Christopher Haster
5fc53bd726 Changed internal functions to return tags over pointers
One neat (if gimmicky) trick is that each tag has a valid bit in the
highest bit position of the 32-bit word. This is used to determine when
to stop a fetch operation, but after fetch, the bit is free to use in
the driver. This means we can create a typed-union of sorts with error
codes and tags, returning both as the return value from a function.

Say what you will about this trick, it does have a significant impact on
code size. I suspect this is primarily due to the compiler having a hard
time optimizing around pointer access.
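
A sketch of the trick (the names here are illustrative):

#include <stdint.h>

// tags keep their valid bit in the sign position, so a signed 32-bit
// value can carry either a tag or a negative error code
typedef uint32_t lfs_tag_t;
typedef int32_t lfs_stag_t;

static int use_result(lfs_stag_t res) {
    if (res < 0) {
        return (int)res; // error code, propagate
    }
    lfs_tag_t tag = (lfs_tag_t)res; // otherwise a valid tag
    (void)tag;
    return 0;
}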
2018-10-16 04:50:35 -05:00
Christopher Haster
2b35c36b67 Renamed tag functions and macros
- lfs_tagverb -> lfs_tag_verb
- lfs_mktag -> LFS_MKTAG (it's a macro now)
- LFS_STRUCT_THING -> LFS_THINGSTRUCT
2018-10-16 04:47:20 -05:00
Christopher Haster
7c88bc96b6 Restructured get/traverse functions
While it makes sense to reuse as many code paths as possible, it turns
out that the logic behind the traversal of littlefs's metadata-pairs is
so simple that it's actually cheaper to duplicate the traversal code
where needed.

This means instead of the code path move -> traverse -> movescan -> get
-> traverse -> getscan, we can use the relatively flatter code path of
move -> get.
2018-10-16 04:40:32 -05:00
Christopher Haster
e3b867897a Modified results from find-like functions to use tags
Tags offer all of the necessary info from the find functions (which
makes sense, this is the structure that stores the info on disk).
Passing around a single tag instead of separate id and type fields
simplifies the internal functions while leveraging the tag's compactness.
2018-10-16 04:37:23 -05:00
Christopher Haster
67d9f88b0e Combined get functions into one
Unfortunately, the three different sets of get functions were not
contributing very much, and having three different get functions means
we may be wasting code on redundant code paths.

By dropping the use of the lfs_mattr_t struct in favor of a buffer, we
can combine the three code paths with a bit of tweaking.
2018-10-16 04:34:07 -05:00
Christopher Haster
7ad9700d9e Integrated findscan into fetch as a built in side effect
Now that littlefs's fetchwith operations have stabilized a bit, there's
actually only a single fetchwith operation, the findscan function.
Given that there's no need for the internal functions to be a forward
compatible API, we can integrate the findscan behaviour directly into
fetchwith and avoid the (annoyingly) costly generalization overhead.

As an added benefit, we can easily add additional tag modifications
during fetch, such as the synthetic moves needed to resolve in-flight
move operations without disk modifications.
2018-10-16 04:25:24 -05:00
Christopher Haster
fe31f79b5f Consolidated find/parent scanning functions
An interesting observation about the find and parent scanning functions
is that at their core, they're both actually doing the same operation.
They search a metadata-pair during fetch for an entry, using the entry
data instead of the entry tag. This means we can combine these functions and
get a decent code savings.

It's a little bit trickier because pair ordering isn't guaranteed. But
to work around that we can simply search for both pair orderings. It's a
bit more expensive but may be worth the code savings. A fancier
implementation in the future can avoid the 2x lfs_parent scans.
2018-10-16 04:14:48 -05:00
Christopher Haster
fd121dc2e2 Dropped "has id" bit encoding in favor of invalid id
I've been trying to keep tag types organized with an encoding that hints
if a tag uses its id field for file ids. However, this seems to have
been a mistake. Using a null id of 0x3ff greatly simplified quite a bit
of
the logic around managing file related tags.

The downside is one less id we can use, but if we look at the encoding
cost, donating one full bit costs us 2^9 id permutations vs 1 id
permutation. So even if we had a perfect encoding it's in our favor to
use a null id. The cost of null ids is code size, but with the
complexity around figuring out whether a type used its id or not, it
just works out better to use a null id.
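
A sketch of the encoding trade-off (names invented):

#include <stdbool.h>
#include <stdint.h>

// 0x3ff, all ones in the 10-bit id field, is the null id: one id
// permutation spent instead of donating a whole bit
#define NULL_ID 0x3ff

static bool has_id(uint16_t id) {
    return id != NULL_ID;
}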
2018-10-15 18:49:01 -05:00
Christopher Haster
b7bd34f461 Restructured types to use a more flexible bit encoding
Recall that the 32-bit tag structure contains a 9-bit type. The type
structure then decomposes into a bit more information:
[---   9   ---]
[1|- 4 -|- 4 -]
 ^   ^     ^- specific type
 |   \------- subtype
 \----------- user bit

The main change comes from an observation made while moving type info
from the struct tag to the name tag. Since we don't need the type info
in the struct
tag, we can significantly simplify the type structure.
2018-10-15 18:34:26 -05:00
Christopher Haster
c1103efb53 Changed type info to be retrieved from name tag instead of struct tag
Originally, I had type info encoded in the struct tag. This initially
made sense because the type info only directly impacts the struct tag.
However this was a case of focusing too much on the details instead of
the bigger picture.

Many file operations need to figure out the type of a file, but only a
small number of them actually need to interact with the file's
structure. For the common case, providing the type of the file early
shortens operations by a full tag access.

Additionally, by storing the type in the file name tag, we open up the
struct tag to use those bits for storing more struct descriptions.
2018-10-15 18:27:28 -05:00
Christopher Haster
d7b0652936 Removed old move logic, now passing move tests
The introduction of xored-globals required quite a bit of work to
integrate. But now that that is working, we can strip out the old move
logic.

It's worth noting that the xored-globals integration with commits is
relatively complex and subtle.
2018-10-15 16:03:18 -05:00
Christopher Haster
2ff32d2dfb Fixed bug where globals were poisoning move commits
The issue lies in the reuse of the id field for globals. Before globals,
the only tags with a non-null (0x3ff) id field were names, structs, and
other file-specific metadata. But globals are also using this field for
the indirect delete, since otherwise the globals structure would be very
unaligned (74-bits long).

To make matters worse, the id field for globals contains the delta used
to reconstruct the globals at mount time. Which means the id field could
take on very absurd values and break the dir fetch logic if we're not
careful.

The solution is to use the scope portion of the type field where
necessary, although unfortunately this does add some code cost.
2018-10-15 15:56:04 -05:00
Christopher Haster
b46fcac585 Fixed issues with finding wrong ids after bad commits
Unfortunately, the behaviour needed of lfs_dir_fetchwith is as subtle as
it is important. When fetching from a block corrupted by power-loss,
lfs_dir_fetch must be able to rewind any state it picks up to before the
corruption. This is not limited to the directory state, but includes
find results and other side-effects.

This gets a bit complicated when trying to generalize littlefs's
fetchwith mechanics. Being able to scan a directory block during a fetch
greatly impacts the runtime of littlefs operations, but if the state is
generic how do we know what to rollback to?

The fix here is to leave the management of rolling back state to the
fetchwith match functions, and transparently pass a CRC tag to indicate
the temporary state can be saved.
2018-10-13 19:46:38 -05:00
Christopher Haster
cebf7aa0fe Switched back to simple deorphan-step on directory remove
Originally I tried to reuse the indirect delete to accomplish truly
atomic directory removes; however, this fell apart when it came to
implementing directory removes as a side-effect of renames.

A single indirect-delete simply can't handle renames with removes as
a side effect. When copying an entry to its destination, we need to
atomically delete both the old entry and the source of our copy. We
can't delete both with only a single indirect-delete. It is possible to
accomplish this with two indirect-deletes, but this is such an uncommon
case that it's really not worth supporting efficiently due to how
expensive globals are.

I also dropped indirect-deletes for normal directory removes. I may
add them back later, but at the moment it's extra code cost for a path
that's not traveled very often.

As a result, I restructured the indirect delete handling to be a bit
more generic, now with a multipurpose lfs_globals_t struct instead of
the delete-specific lfs_entry_t struct.

Also worked on integrating xored-globals, now with several primitive
global operations to manage fetching/updating globals on disk.
2018-10-13 19:35:45 -05:00
Christopher Haster
3ffcedb95b Restructured tags to better support xored-globals
32-bit tag structure:
[---        32       ---]
[1|- 9 -|- 10 -|-- 12 --]
 ^   ^     ^       ^- entry length
 |   |     \--------- file id
 |   \--------------- tag type
 \------------------- valid

In this tag, the type decomposes into some more information:
[---      9      ---]
[1|- 2 -|- 3 -|- 3 -]
 ^   ^     ^     ^- struct
 |   |     \------- type
 |   \------------- scope
 \----------------- user

The change in this encoding is the addition of a global scope:
LFS_SCOPE_STRUCT = 0 00 xxx xxx
LFS_SCOPE_ENTRY  = 0 01 xxx xxx
LFS_SCOPE_DIR    = 0 10 xxx xxx
LFS_SCOPE_FS     = 0 11 xxx xxx
LFS_SCOPE_USER   = 1 xx xxx xxx
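
A sketch of field extraction for this layout (helper names invented):

#include <stdint.h>

typedef uint32_t lfs_tag_t;

// [1 valid | 9 type | 10 id | 12 length]
static inline uint32_t tag_type(lfs_tag_t t)   { return (t >> 22) & 0x1ff; }
static inline uint32_t tag_id(lfs_tag_t t)     { return (t >> 12) & 0x3ff; }
static inline uint32_t tag_length(lfs_tag_t t) { return t & 0xfff; }

// the scope (including the user bit) is the top 3 bits of the type
static inline uint32_t tag_scope(lfs_tag_t t)  { return tag_type(t) >> 6; }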
2018-10-13 19:12:35 -05:00