- Callbacks for get/match. This does have a code cost, but it allows more
code reuse, which almost balances out the cost, and it also reduces
maintenance and increases flexibility. Callbacks may also be able to
be gc-ed in some cases.
- Consistent struct vs _t usage: _t for external-facing structs that
shouldn't be messed with outside the library, struct for external and
internal structs that anyone with access is allowed to modify.
- Reorganized several high-level function groups
- Inlined structures that didn't need separate definitions in header
This follows from enabling tag deletion, however does require some
consideration with the APIs.
Now we can remove custom attributes, as well as determine if an attribute
exists or not.
littlefs has a mechanism for deleting file entries, but it doesn't have
a mechanism for deleting individual tags. This _is_ sufficient for a
filesystem, but limits our flexibility. Deleting attributes would be
useful in the custom attribute API and for future improvements (hint the
child pointers in B-trees).
However, deleting attributes is tricky. We can't just omit the
attribute, since we can only add new tags. Additionally, we need a way
to track what attributes have been deleted during compaction, which
currently relies on writing out attributes to disk.
The solution here is pretty nifty. First we have to come up with a way
to represent a "deleted" attribute. Rather than adding an additional
bit to the already squished tag structure, we use a -1 length field,
specifically 0xfff. Now we can commit a delete attribute, and this
deleted tag acts as a placeholder during compacts.
However our delete tag will never leave our metadata log. We need some
way to discard our delete tag if we know it's the only representation of
that tag on the metadata log. Ah! We know it's the only tag if it's in
the first commit on the metadata log. So we add an additional bit to the
CRC entry to indicate if we're on the first commit, and use that to
decide if we need to keep delete tags around.
Now we have working tag deletion.
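
As a rough sketch of the idea, assuming a hypothetical layout where the length sits in the low 12 bits of a 32-bit tag (only the 0xfff value comes from the description above; the macro and function names are made up for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical layout for illustration: the length lives in the low 12 bits
// of a 32-bit tag. Only the 0xfff "deleted" value comes from the text above.
#define TAG_LEN_MASK   0x00000fffu
#define TAG_LEN_DELETE 0x00000fffu

// Turn a tag into its "deleted" form by setting the length field to 0xfff.
static uint32_t tag_mkdelete(uint32_t tag) {
    return tag | TAG_LEN_DELETE;
}

// A tag is a delete tag if its length field is all ones (-1).
static bool tag_isdelete(uint32_t tag) {
    return (tag & TAG_LEN_MASK) == TAG_LEN_DELETE;
}

// Decide whether a tag still needs to be kept around. A delete tag in the
// first commit has no older copy of the tag to shadow, so it can be dropped.
static bool tag_keep(uint32_t tag, bool first_commit) {
    return !(tag_isdelete(tag) && first_commit);
}
```

The first_commit argument stands in for the extra bit in the CRC entry; how that bit is actually stored is not shown here.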
Interestingly enough, tag deletion is actually indirectly more efficient
than entry deletion, since compacting entries requires multiple passes,
whereas tag deletion gets cleaned up lazily. However we can't adopt the
same strategy in entry deletion because of the compact ordering of
entries. Tag deletion works because tag types are unique and static.
Managing entry deletion in this manner would require static id
allocation, which would cause problems when creating files, running out
of space, and disallow arbitrary insertions of files.
Currently unused, the insertion of new file entries in arbitrary
locations in a metadata-pair is very easy to add into the existing
metadata logging.
The only tricky things:
1. Name tags must strictly precede any tags related to a file. We can
pull this off during a compact, but must make two passes. One for the
name tag, one for the file. Though a benefit of this is that now our
scans during moves can exit early upon finding the name tag.
2. We need to handle name tags appearing out of order. This makes name
tags symmetric to deletes, although it doesn't seem like we can
leverage this fact very well. Note this also means we need to make
the superblock tag a type of name tag.
The valid bit present in tags is a requirement to properly detect the
end of commits in metadata logs. The way it works is that the CRC entry is
allowed to specify what is needed from the next tag's valid bit. If it's
incorrect, we've reached the end of the commit. We then set the valid bit to
indicate when we tried to program a new commit. If we lose power, this
commit will still be thrown out by a bad checksum.
However, the valid bit is unused outside of the CRC entry. Here we turn on the
valid bit for all tags, which means we have a decent chance of exiting early
if we hit a half-written commit. We still need to guarantee detection of
the valid bit on commits following the CRC entry, so we allow the CRC
entry to flip the expected valid bit.
The only tricky part is what valid bit we expect by default, since this
is used on the first commit on a metadata log. Here we default to a 1,
which gives us the fastest exit on blocks that erase to 0. This is
because blocks that erase to 1s will implicitly flip the valid bit of
the next tag, allowing us to exit on the next tag.
If we defaulted to 0, we could exit faster on disks that erase to 1, but
would need to scan the entire block on disks that erase to 0 before we
realize a CRC commit is never coming.
The commit machine in littlefs has three stages: commit, compact, and
then split. First we try to append our commit to the metadata log. If
that fails, we try to compact the metadata log to remove duplicates and make
room for the commit. If that still fails, we split the metadata into two
metadata-pairs and try again. Each stage is less efficient but also less
frequent.
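
The three stages can be sketched with a toy model. This is not the real commit logic: the toy_log struct, its byte counters, and the "move half of the live data" split policy are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

// Toy model of one metadata log: `used` bytes written, of which `live` bytes
// are still current (the rest are stale duplicates). All names are hypothetical.
struct toy_log {
    size_t capacity;
    size_t used;
    size_t live;
};

enum toy_stage { TOY_COMMIT, TOY_COMPACT, TOY_SPLIT, TOY_NOSPC };

// Stage 1: append to the log if there is room.
static bool toy_append(struct toy_log *log, size_t size) {
    if (log->used + size > log->capacity) {
        return false;
    }
    log->used += size;
    log->live += size;
    return true;
}

// Stage 2: rewrite the log with only the live data.
static void toy_compact(struct toy_log *log) {
    log->used = log->live;
}

// Stage 3: move half of the live data into a second log.
static void toy_split(struct toy_log *log, struct toy_log *tail) {
    tail->capacity = log->capacity;
    tail->live = log->live / 2;
    tail->used = tail->live;
    log->live -= tail->live;
    log->used = log->live;
}

// Try each stage in order, returning the stage that made the commit fit.
static enum toy_stage toy_commit(struct toy_log *log, struct toy_log *tail,
        size_t size) {
    if (toy_append(log, size)) {
        return TOY_COMMIT;
    }
    toy_compact(log);
    if (toy_append(log, size)) {
        return TOY_COMPACT;
    }
    toy_split(log, tail);
    if (toy_append(log, size)) {
        return TOY_SPLIT;
    }
    return TOY_NOSPC;
}
```

A log that is mostly stale data stops at the compact stage, while a log that is full of live data falls through to the split.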
However, in the case that we're filling up a directory with new files,
such as the bootstrap process in setting up a new system, we must pass
through all three stages rather quickly in order to get enough
metadata-pairs to hold all of our files. This means we'll compact,
split, and then need to compact again. This creates more erases than is
needed in the optimal case, which can be a big cost on disks with an
expensive erase operation.
In theory, we can actually avoid this redundant erase by reusing the
data we wrote out in the first attempt to compact. In practice, this
trick is very complicated to pull off.
1. We may need to cache a half-completed program while we write out the
new metadata-pair. We need to write out the second pair first in
order to get our new tail before we complete our first metadata-pair.
This requires two pcaches, which we don't have.
The solution here is to just drop our cache and reconstruct what it
would have been. This needs to be perfect down to the byte level
because we don't have knowledge of where our cache lines are.
2. We may have written out entries that are then moved to the new
metadata-pair.
The solution here isn't pretty but it works: we just add a delete
tag for any entry that was moved over.
In the end the solution ends up a bit hacky, with different layers poked
through the commit logic in order to manage writes at the byte level
from where we manage splits. But it works fairly well and saves erases.
The littlefs driver has always had this really weird quirk: larger cache
sizes can significantly harm performance. This has probably been one of
the most surprising pieces of configuring and optimizing littlefs.
The reason is that littlefs's caches are kinda dumb (this is somewhat
intentional, as dumb caches take up much less code space than smart
caches). When littlefs needs to read data, it will load the entire cache
line. This means that even when we only need a small 4 byte piece of
data, we may need to read a full 512 byte cache. And since
microcontrollers may be reading from storage over relatively slow bus
protocols, the time to send data over the bus may dominate other
operations.
Now that we have separate configuration options for "cache_size" and
"read_size", we can start making littlefs's caches a bit smarter. They
aren't going to be perfect, because code size is still a priority, but
there are some small improvements we can do:
1. Program caches write to prog_size aligned units, but eagerly cache as
much as possible. There's no downside to using the full cache in
program operations.
2. Add a hint parameter to cached reads. This internal API allows callers
to tell the cache how much data they expect to need. This avoids
excess bus traffic, and now we can even bypass the cache if the
caller provides enough of a buffer.
We can still fall back to reading full cache-lines in the cases where
we don't know how much data we need by providing the block size as
the hint. We do this for directory fetches and for file reads.
This has immediate improvements for both metadata-log traversal and CTZ
skip-list traversal, since these both only need to read 4-byte pointers
and can always bypass the cache, allowing reuse elsewhere.
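
A minimal sketch of how a hint could bound a cache fill, assuming hypothetical helper names and an assumed bypass condition (an aligned request of at least one read unit). The real internal API may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t align_down(uint32_t x, uint32_t a) {
    return x - (x % a);
}

static uint32_t align_up(uint32_t x, uint32_t a) {
    return align_down(x + a - 1, a);
}

static uint32_t min_u32(uint32_t a, uint32_t b) {
    return a < b ? a : b;
}

// How many bytes to load into the read cache for a read at `off`.
// The hint is rounded out to read_size and capped by the cache and the block.
static uint32_t cache_fill_size(uint32_t off, uint32_t hint,
        uint32_t read_size, uint32_t cache_size, uint32_t block_size) {
    assert(read_size > 0 && cache_size % read_size == 0);
    assert(off < block_size);
    uint32_t start = align_down(off, read_size);
    uint32_t end = min_u32(align_up(off + hint, read_size), block_size);
    return min_u32(end - start, cache_size);
}

// Assumed bypass rule: an aligned request that covers at least one read
// unit can go straight into the caller's buffer.
static bool can_bypass_cache(uint32_t off, uint32_t size, uint32_t read_size) {
    assert(read_size > 0);
    return off % read_size == 0 && size >= read_size;
}
```

With a 16 byte read_size and a 512 byte cache, a 4 byte hint loads a single read unit, while passing the block size as the hint falls back to filling the whole cache line.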
While ECORRUPT is not a wrong error code, it doesn't match other
instances of hitting a corrupt block during write. During writes, if
blocks are detected as corrupt their data is evicted and moved to a new
clean block. This means that at the end of a disk's lifetime, exhaustion
errors will be reported as ENOSPC when littlefs can't find any new block
to store the data.
This has the benefit of matching behaviour when a new file is written
and no more blocks can be found, due to either a small disk or corrupted
blocks on disk. To littlefs it's like the disk shrinks in size over
time.
The initial implementation of inline files was thrown together fairly
quickly. However, it has worked well so far and there hasn't been much
reason to change it.
One shortcut was to trick file writes into thinking they are writing to
imaginary blocks. This works well and reuses most of the file code
paths, as long as we don't flush the imaginary block out to disk.
Initially we did this by limiting inline_max to cache_max-1, ensuring
that the cache never fills up and gets flushed. This was a rather dirty
hack. The better solution, implemented here, is to handle the
representation of an "imaginary" block correctly all the way down into
the cache layer.
So now for files specifically, the value -1 represents a null pointer,
and the value -2 represents an "imaginary" block. This may become a
problem if the number of blocks approaches the max, however this -2
value is never written to disk and can be changed in the future without
breaking compatibility.
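
Sketched in code, with hypothetical macro and function names (only the -1 and -2 values come from the description above):

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical names; only the -1 and -2 values come from the text above.
#define BLOCK_NULL   ((uint32_t)-1)  // no block
#define BLOCK_INLINE ((uint32_t)-2)  // "imaginary" block backing an inline file

// True if `block` refers to real storage that a cache flush may program.
// The cache layer can use a check like this to avoid flushing inline data.
static bool block_is_on_disk(uint32_t block) {
    return block != BLOCK_NULL && block != BLOCK_INLINE;
}
```

Since both sentinels sit at the very top of the 32-bit range, they only collide with real block addresses if the block count approaches the maximum.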
Because a block can go bad at any time, if we're unlucky, we may end up
generating multiple orphans in a single metadata write. This is
exacerbated by the early eviction in dynamic wear-leveling.
We can't track _all_ orphans, because that would require unbounded
storage and significantly complicate things, but there are a handful of
intentional orphans we do track because they are easy to resolve without
the O(n^2) deorphan scan. These are anytime we intentionally remove a
metadata-pair.
Initially we cleaned up orphans as they occur with whatever knowledge we
do have, and just accepted the extra O(n^2) deorphan scans in the
unlucky case. However we can do a bit better by being lazy and leaving
deorphaning up to the next metadata write. This needs to work with the known
orphans while still setting the orphan flag on disk correctly. To
accomplish this we replace the internal flag with a small counter.
Note, this means that our internal representation of orphans differs
from what's on disk. This is annoying but not the end of the world.
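
A minimal sketch of the counter idea, assuming a hypothetical orphan_state struct and an 8-bit counter. The point is only that the on-disk flag is derived from the in-RAM count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical in-RAM state: a small counter of known orphans.
struct orphan_state {
    uint8_t count;
};

// Record a new known orphan.
static void orphans_add(struct orphan_state *s) {
    assert(s->count < UINT8_MAX);
    s->count++;
}

// Record that one known orphan has been cleaned up.
static void orphans_resolve(struct orphan_state *s) {
    assert(s->count > 0);
    s->count--;
}

// The on-disk representation is still a single flag: any orphans or none.
static bool orphans_disk_flag(const struct orphan_state *s) {
    return s->count > 0;
}
```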
This was a pretty simple oversight on my part. Conceptually, there's no
difference between lfs_fs_getattr and lfs_getattr("/"). Any operations
on directories can be applied "globally" by referring to the root
directory.
Implementation wise, this actually fixes the "corner case" of storing
attributes on the root directory, which is broken since the root
directory doesn't have a related entry. Instead we need to use the root
superblock for this purpose.
Fewer functions means less code to document and maintain, so this is a
nice benefit. Now we just have a single lfs_getattr/setattr/removeattr set
of functions along with the ability to access attributes atomically in
lfs_file_opencfg.
This implements the second step of full dynamic wear-leveling, block
allocation randomization. This is the key part that uniformly distributes
wear across the filesystem, even through reboots.
The entropy actually comes from the filesystem itself, by xoring
together all of the CRCs in the metadata-pairs on the filesystem. While
this sounds like a ridiculous operation, it's easy to do when we already
scan the metadata-pairs at mount time.
This gives us a random number we can use for block allocation.
Unfortunately it's not a great general purpose random generator as the
output only changes every filesystem write. Fortunately that's exactly
when we need our allocator.
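
As a sketch, assuming the CRCs have already been collected into an array during mount (the function names and the modulo mapping to a starting block are illustrative assumptions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Xor together the CRCs found while scanning the metadata-pairs.
static uint32_t seed_from_crcs(const uint32_t *crcs, size_t count) {
    uint32_t seed = 0;
    for (size_t i = 0; i < count; i++) {
        seed ^= crcs[i];
    }
    return seed;
}

// Pick the block where the allocator starts scanning (assumed mapping).
static uint32_t alloc_start(uint32_t seed, uint32_t block_count) {
    assert(block_count > 0);
    return seed % block_count;
}
```

The same set of CRCs always produces the same seed, which is what keeps the result deterministic for testing.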
---
Additionally, the randomization created a mess for the testing
framework. Fortunately, this method of randomization is deterministic.
A very useful property for reproducing bugs.
Initially, littlefs relied entirely on bad-block detection for
wear-leveling. Conceptually, at the end of a device's lifespan, all
blocks would be worn evenly, even if they weren't worn out at the same
time. However, this doesn't work for all devices. Rather than causing
corruption during writes, wear reduces a device's "sticking power",
causing bits to flip over time. This means for many devices, true
wear-leveling (dynamic or static) is required.
Fortunately, way back at the beginning, littlefs was designed to do full
dynamic wear-leveling, only dropping it when making the retrospectively
short-sighted realization that bad-block detection is theoretically
sufficient. We can enable dynamic wear-leveling with only a few tweaks
to littlefs. These can be implemented without breaking backwards
compatibility.
1. Evict metadata-pairs after a certain number of writes. Eviction in
this case is identical to a relocation to recover from a bad block.
We move our data and stick the old block back into our pool of
blocks.
For knowing when to evict, we already have a revision count for each
metadata-pair which gives us enough information. We add the
configuration option block_cycles and evict when our revision count
is a multiple of this value.
2. Now all blocks participate in COW behaviour. However we don't store
the state of our allocator, so every boot cycle we reuse the first
blocks on storage. This is very bad on a microcontroller, where we
may reboot often. We need a way to spread our usage across the disk.
To pull this off, we can simply randomize which block we start our
allocator at. But we need a random number generator that is different
on each boot. Fortunately we have a great source of entropy, our
filesystem. So we seed our block allocator with a simple hash of the
CRCs on our metadata-pairs. This can be done for free since we
already need to scan the metadata-pairs during mount.
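
The eviction check in step 1 can be sketched as below. Treating block_cycles == 0 as "eviction disabled" is an assumption of this sketch, not something stated above.

```c
#include <stdbool.h>
#include <stdint.h>

// Evict a metadata-pair when its revision count hits a multiple of
// block_cycles. A block_cycles of 0 disables eviction in this sketch.
static bool should_evict(uint32_t rev, uint32_t block_cycles) {
    return block_cycles > 0 && rev % block_cycles == 0;
}
```

With block_cycles = 100, a pair is relocated on revisions 100, 200, 300, and so on, and the old blocks go back into the pool.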
What we end up with is a uniform distribution of wear on storage. The
wear is not perfect, if a block is used for metadata it gets more wear,
and the randomization may not be exact. But we can never actually get
perfect wear-leveling, since we're already resigned to dynamic
wear-leveling at the file level.
With the addition of metadata logging, we end up with a really
interesting two-stage wear-leveling algorithm. At the low-level,
metadata is statically wear-leveled. At the high-level, blocks are
dynamically wear-leveled.
---
This specific commit implements the first step, eviction of metadata
pairs. Intertwining this into the already complicated compact logic was
a bit annoying, however we can combine the logic for superblock
expansion with the logic for metadata-pair eviction.
Found while testing big-endian support. Basically, if littlefs is really
really unlucky, the block allocator could kick in while committing a
file's CTZ reference. If this happens, the block allocator will need to
traverse all CTZ skip-lists in memory, including the skip-list we're
committing. This means we can't convert the CTZ's endianness in place,
and need to make a copy on big-endian systems.
We rely on dead-code elimination from the compiler to make the
conditional behaviour for big-endian vs little-endian systems a noop
determined by the lfs_tole32 intrinsic.
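
A rough sketch of the copy-then-convert pattern. The to_le32 helper and the ctz_ref struct are hypothetical stand-ins, not the real lfs_tole32 or the real CTZ struct.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for a tole32 helper: returns a value whose in-memory
// bytes are little-endian. On a little-endian host this is the identity.
static uint32_t to_le32(uint32_t x) {
    uint8_t bytes[4] = {
        (uint8_t)(x >> 0), (uint8_t)(x >> 8),
        (uint8_t)(x >> 16), (uint8_t)(x >> 24),
    };
    uint32_t le;
    memcpy(&le, bytes, sizeof(le));
    return le;
}

// Hypothetical in-memory CTZ reference.
struct ctz_ref {
    uint32_t head;
    uint32_t size;
};

// Convert into a copy, leaving the in-memory struct usable by anything that
// traverses it while the commit is still in progress.
static struct ctz_ref ctz_to_disk(const struct ctz_ref *ctz) {
    struct ctz_ref disk = { to_le32(ctz->head), to_le32(ctz->size) };
    return disk;
}
```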
In looking at the common CRC APIs out there, this seemed the most
common. At least more common than the current modified-in-place pointer
API. It also seems to have a slightly better code footprint. I'm blaming
pointer optimization issues.
One downside is that lfs_crc can't report errors, however it was already
assumed that lfs_crc can not error.
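
For illustration, here is what a value-returning CRC API looks like. This sketch uses a plain bitwise CRC-32 with the common reflected polynomial, under a hypothetical name, and is not the implementation in the tree.

```c
#include <stddef.h>
#include <stdint.h>

// Value-returning CRC: takes the running CRC and returns the updated one,
// rather than modifying it through a pointer. Bitwise, reflected 0xedb88320.
static uint32_t crc32_update(uint32_t crc, const void *buffer, size_t size) {
    const uint8_t *data = buffer;
    for (size_t i = 0; i < size; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1) ? (crc >> 1) ^ 0xedb88320u : (crc >> 1);
        }
    }
    return crc;
}
```

Calls chain naturally, for example crc = crc32_update(crc, a, alen); crc = crc32_update(crc, b, blen); and there is no channel for an error code, which matches the assumption that the CRC can not fail.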
This was mostly tweaking test cases to be accommodating for variable
sized superblock-lists. Though there were a few bugs that needed fixing:
- Changed compact to use source dir for move since the original dir
could have changed as a result of an expand.
- Created copy of current directory so we don't overwrite ourselves
during an internal commit update.
Also made sure all of the test suites provide reproducible results when
run independently (the entry tests were behaving differently based on
which tests were run before).
(Some were legitimate test failures)
In v1, littlefs didn't trust blocks that had been previously erased and
conservatively erased any blocks before writing to them. This was a part
of the design since the beginning because of the complexity of managing
erased blocks when we can lose power at any time.
However, we theoretically could keep track of files that have been
properly erased by marking them with an "erased bit". A file marked this
way could be opened and appended to without needing to COW the last
block. The requirement would be that the "erased bit" is cleared during
a write, since a power-loss would require that littlefs no longer trust
the erased state of the file.
This commit just shuffles the struct types around to make space for an
"erased bit" in the struct type field to be added in the future. This
ordering also makes more sense, since there will likely be more file
representations than directory representations on disk.
Expanding superblocks has been on my wishlist for a while. The basic
idea is that instead of maintaining a fixed offset blocks {0, 1} to the
root directory (1 pointer), we maintain a dynamically sized
linked-list of superblocks that point to the actual root. If the number
of writes to the root exceeds some value, we increase the size of the
superblock linked-list.
This can leverage existing metadata-pair operations. The revision count for
metadata-pairs provides some knowledge on how much wear we've put on the
superblock, and the threaded linked-list can also be reused for this
purpose. This means superblock expansion is both optional and cheap to
implement.
Expanding superblocks helps both extremely small and extremely large filesystems
(extreme being relative of course). On the small end, we can actually
collapse the superblock into the root directory and drop the hard requirement
of 4-blocks for the superblock. On the large end, our superblock will
now last longer than the rest of the filesystem. Each time we expand,
the number of cycles until the superblock dies is increased by a power.
Before we were stuck with this layout:
level  cycles  limit    layout
1      E^2     390 MiB  s0 -> root
Now we expand every time a fixed offset is exceeded:
level  cycles  limit    layout
0      E       4 KiB    s0+root
1      E^2     390 MiB  s0 -> root
2      E^3     37 TiB   s0 -> s1 -> root
3      E^4     3.6 EiB  s0 -> s1 -> s2 -> root
...
Where the cycles are the number of cycles before death, and the limit is
the worst-case filesystem size at which early superblock death becomes a
concern (all writes go to root), using this formula: E^|s| = E*B, where
E = erase cycles = 100000 and B = block count, assuming 4096 byte blocks.
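
The limit column can be reproduced with a few lines of arithmetic. Reading the table, the cycles at a given level are E^(level+1), so E^(level+1) = E*B gives B = E^level blocks. The function and macro names here are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_ERASE_CYCLES 100000ull  // E
#define EXAMPLE_BLOCK_SIZE   4096ull    // bytes per block

// Worst-case filesystem size in bytes for a given superblock level:
// B = E^level blocks, times the block size.
static uint64_t superblock_limit_bytes(unsigned level) {
    assert(level <= 3);  // E^4 blocks of 4096 bytes would overflow 64 bits
    uint64_t blocks = 1;
    for (unsigned i = 0; i < level; i++) {
        blocks *= EXAMPLE_ERASE_CYCLES;
    }
    return blocks * EXAMPLE_BLOCK_SIZE;
}
```

For level 1 this gives 409600000 bytes, which is roughly 390 MiB, and for level 2 it gives 4.096e13 bytes, roughly 37 TiB.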
Note we can also store copies of the superblock entry on the expanded
superblocks. This may help filesystem recovery tools in the future.
This is a downside caused by relying on an external repo for testing,
but also storing the CI configuration inside this repo. Fortunately we
can use a temporary v2-alpha branch in the FUSE repo mirroring the
v2-alpha branch for testing.
LFS_ERR_CORRUPT is unfortunately not a well defined error code. It's
very important in the context of littlefs, but missing from the standard
error codes defined in Linux.
After some discussions with other developers, it was encouraged to use
the encoding for EILSEQ over EBADE for representing on-disk corruption, as
EILSEQ implies that there is something wrong with the data.
I've changed this now to take advantage of the breaking changes in v2 to
avoid a risky change to a return value.
Because of limitations in how littlefs manages attributes on disk,
littlefs views zero-length attributes and missing attributes as the same
thing. The simplest implementation of attributes mirrors this behaviour
transparently for the user.
State stealing is a tricky part of managing the xored-globals. When
removing a metadata-pair from the metadata chain, whichever
metadata-pair does the removing is also responsible for stealing the
removed metadata-pair's global delta and incorporating it into its own
global delta. Otherwise the global state would become corrupted.
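
A toy sketch of why stealing keeps the global state intact, assuming the global state is simply the xor of every pair's delta and shrinking each delta to a single 32-bit word (the struct and function names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical metadata-pair carrying only its global delta.
struct toy_pair {
    uint32_t gdelta;
};

// The global state is the xor of every pair's delta.
static uint32_t toy_gstate(const struct toy_pair *pairs, size_t count) {
    uint32_t state = 0;
    for (size_t i = 0; i < count; i++) {
        state ^= pairs[i].gdelta;
    }
    return state;
}

// The remover absorbs the removed pair's delta before it leaves the chain.
static void toy_steal(struct toy_pair *remover, struct toy_pair *removed) {
    remover->gdelta ^= removed->gdelta;
    removed->gdelta = 0;
}
```

Skipping the steal would drop the removed pair's contribution from the xor, which is exactly the corruption described above.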
- Updated documentation where needed
- Added asserts which take into account relationships with the new
cache_size configuration
- Restructured ordering to be consistent for the three main
configurables: LFS_ATTR_MAX, LFS_NAME_MAX, and LFS_INLINE_MAX
The introduction of an explicit cache_size configuration allows
customization of the cache buffers independently from the hardware
read/write sizes.
This has been one of littlefs's main handicaps. Without a distinction
between cache units and hardware limitations, littlefs isn't able to
read or program _less_ than the cache size. This leads to the
counter-intuitive case where larger cache sizes can actually be harmful,
since larger read/prog sizes require sending more data over the bus if
we're only accessing a small set of data (for example the CTZ skip-list
traversal).
This is compounded with metadata logging, since a large program size
limits the number of commits we can write out in a single metadata
block. It really doesn't make sense to link program size + cache
size here.
With a separate cache_size configuration, we can be much smarter about
what we actually read/write from disk.
This also simplifies cache handling a bit. Before there were two
possible cache sizes, but these were rarely used. Note that the
cache_size is NOT written to the superblock and can be freely changed
without breaking backwards compatibility.
There wasn't much use (and inconsistent compiler support) for storing
small values next to the unaligned lfs_global_t struct. So instead, I've
rounded the struct up to the nearest word to try to take advantage of
the alignment in xor and memset operations.
I've also moved the global fetching into lfs_mount, since that was the
only use of the operation. This allows for some variable reuse in the
mount function.
This is an effort to try to consolidate the handling of in-flight files
and dirs opened by the user (and possibly opened internally). Both files
and dirs have metadata state that need to be kept in sync by the commit
logic.
This metadata state is mostly contained in the lfs_mdir_t type, which is
present in both the lfs_file_t and lfs_dir_t. Unfortunately both of
these structs have some relatively unrelated metadata that needs to be
kept in sync:
- Files store an id representing the open file
- Dirs store an id during iteration
While these take up the same space, they unfortunately need to be
managed differently by the commit logic.
The best solution I can come up with is to simply store a general
purpose list and tag both structures with LFS_TYPE_REG and LFS_TYPE_DIR
respectively. This is kinda funky, but wins out over duplicating the
commit logic.
Other than removing outdated TODOs, there are several tweaks:
- Standardized naming of fs-level functions (mostly internal names)
- Tweaked low-level use of subtype to hopefully take advantage of
redundant code removal
- Moved root-handling into lfs_dir_getinfo
- Updated DEBUG statements around move/orphan fixes
- Removed trailing 1s in type fields
- Removed unused code
Unfortunately for us, even with the new ability to store global state,
orphans can not be handled as gracefully as moves. This is due to the
fact that directory operations can create an unbounded number of
orphans. While it's usually small, the fact that it's unbounded means we can't
store the orphan info in xored-globals.
However, one thing we can do to leverage the xored-global state is store
a bit indicating if _any_ orphans are present. This means in the common
case we can completely avoid the deorphan step, while only using a
single bit of the global state, which is effectively free since we can
store it in the globals tag itself.
If a littlefs driver does not want to consider the orphan bit, it's free
to use the previous behaviour of always checking for orphans on first
write.
Result of testing on zero-granularity blocks, where the prog size and
read size equals the block size. This represents SD cards and other
traditional forms of block storage where we don't really get a benefit
from the metadata logging.
Unfortunately, since updates in both are tested by the same script,
we can't really use simple bash commands. Added a more complex
script to simulate corruption. Fortunately this should be more robust
than the previous solutions.
The main fixes were around corner cases where the commit logic fell
apart when it didn't have room to complete commits, but these were
fixable in the current design.
The main change here was to drop the in-place twiddling of custom
attributes to match the internal attribute structures. The original
thought was that this could allow the compiler to garbage collect more
of the custom attribute logic when not used, but since this occurs in
the common lfs_file_opencfg function, gc can't really happen.
Not twiddling the user's structure is the polite thing to do, opens up
the ability to store the lfs_attr structure in ROM, and avoids surprising
the user if they attempt to use the structure for their own purposes.
This means we can make the lfs_attr structure const and rely on the list
in the lfs_file_config structure, similar to how we rely on the global
lfs_config structure.
Some other tweaks:
- Dropped the global file_buffer, replaced entirely by per-file buffers.
- Updated LFS_INLINE_MAX and LFS_ATTR_MAX to correct values
- Added workaround for compiler bug related to zero initializer:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
This is a very minor thing but it has been bugging me. On one hand, all
a callback ever needs is a single pointer for context. On the other
hand, you could make the argument that in the context of littlefs, the
lfs_t struct represents global state and should always be available to
callbacks passed to littlefs.
In the end I'm sticking with only a single context pointer, since this
satisfies the minimum requirements and has the highest chance of
function reuse. If a user needs access to the lfs_t struct, it can be
passed by reference in the context provided to the callback.
This also matches callbacks used in other languages with more emphasis
on objects and classes. Usually the callback doesn't get a reference to
the caller.
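
A small sketch of the single-context-pointer style. The traversal function, callback type, and context struct are hypothetical. The fs field shows where a caller could stash a pointer to the filesystem object if the callback needs it.

```c
#include <stddef.h>
#include <stdint.h>

// Callback type: a single context pointer plus the per-item data.
typedef int (*block_cb)(void *ctx, uint32_t block);

// Hypothetical traversal that reports each block to the callback and
// stops at the first nonzero return value.
static int for_each_block(const uint32_t *blocks, size_t count,
        block_cb cb, void *ctx) {
    for (size_t i = 0; i < count; i++) {
        int err = cb(ctx, blocks[i]);
        if (err) {
            return err;
        }
    }
    return 0;
}

// User-defined context. A callback that needs the filesystem object can
// carry a pointer to it here instead of receiving it as an extra argument.
struct count_ctx {
    void *fs;
    size_t seen;
};

static int count_block(void *ctx, uint32_t block) {
    struct count_ctx *c = ctx;
    (void)block;
    c->seen++;
    return 0;
}
```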
The main thing to consider was how lfs_dir_fetchwith reacts to
corruption it finds and to make sure falling back to old values works
correctly.
Some of the tricky bits involved making sure we could fall back to both old
commits and old metadata blocks while still handling things like
synthetic moves correctly.
The main issue here was that the old orphan test relied on deleting the
block that contained the most recent update. In the new design this
doesn't really work since updates get appended to metadata-pairs
incrementally.
This is fixed by instead using the truncate command on the appropriate
block. We're now passing orphan tests.
Now that littlefs has been rebuilt almost from the ground up with the
intention to support custom attributes, adding in custom attribute
support is relatively easy.
The highest bit in the 9-bit type structure indicates that an attribute
is a user-specified custom attribute. The user then has a full 8-bits to
specify the attribute type. Other than that, custom attributes are
treated the same as system-level attributes.
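
In code, the encoding amounts to the following sketch (macro and function names are hypothetical; only the 9-bit type with a user bit on top of 8 user bits comes from the description above):

```c
#include <stdbool.h>
#include <stdint.h>

#define TYPE_USER 0x100u  // highest bit of the 9-bit type field

// Build the 9-bit type for a user attribute from the user's 8-bit type.
static uint16_t type_mkuser(uint8_t user_type) {
    return (uint16_t)(TYPE_USER | user_type);
}

// True if the 9-bit type marks a user-specified custom attribute.
static bool type_isuser(uint16_t type) {
    return (type & TYPE_USER) != 0;
}

// Recover the user's 8-bit type.
static uint8_t type_usertype(uint16_t type) {
    return (uint8_t)(type & 0xffu);
}
```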
Also made some tweaks to custom attributes:
- Adopted the opencfg for file-level attributes provided by dpgeorge
- Changed setattrs/getattrs to the simpler setattr/getattr functions
users will probably be more familiar with. Note that multiple
attributes can still be committed atomically with files, though not
with directories.
- Changed LFS_ATTRS_MAX -> LFS_ATTR_MAX since there's no longer a global
limit on the sum of attribute sizes, which was rather confusing.
Though they are still limited by what can fit in a metadata-pair.
Updated to account for changes as a result of commits/compacts. And
changed instances of iteration over both files and dirs to use a single
nested loop.
This does rely implicitly on the structure layout of dirs/files and
their location in lfs_t, which isn't great. But it gets the job done
with less code duplication.
This was a surprisingly tricky issue. One of the subtle requirements for
the new move handling to work is that the block containing the move does
not change until the move is resolved. Initially, this seemed easy to
implement, given that a move is always immediately followed by its
resolution.
However, the extra metadata-pair operations needed to maintain integrity
present a challenge. At any commit, a directory block may end up moved
as a side effect of relocation due to a bad block.
The fix here is to move the move resolution directly into the commit
logic. This means that any commit to a block containing a move will be
implicitly resolved, leaving the later attempt at move resolution as a
noop.
This fix required quite a bit of restructuring, but as a nice
side-effect some of the complexity around moves actually went away.
Additionally, the new move handling is surprisingly powerful at
combining moves with nearby commits. And we now get same-metadata-pair
renames for free! A win for procrastination on that minor feature.
Restructured function organization to make a bit more sense, and made
some small refactoring tweaks, specifically around the commit logic and
global related functions.
The biggest change here is to make littlefs less obsessed with the
lfs_mattr_t struct. It was limiting our flexibility and can be entirely
replaced by passing the tag + data explicitly. The remaining use of
lfs_mattr_t is specific to the commit logic, where it replaces the
lfs_mattrlist_t struct.
Other changes:
- Added global lfs_diskoff struct for embedding disk references inside
the lfs_mattr_t.
- Reordered lfs_mattrlist_t to squeeze out some code savings
- Added commit_get for explicit access to entries from unfinished
metadata-pairs
- Parameterized the "stop_at_commit" flag instead of hackily storing it
in the lfs_mdir_t temporarily
- Changed return value of lfs_pred to error-only with ENOENT representing
a missing predecessor
- Adopted const where possible
One neat (if gimmicky) trick, is that each tag has a valid bit in the
highest bit position of the 32-bit word. This is used to determine when
to stop a fetch operation, but after fetch, the bit is free to use in
the driver. This means we can create a typed-union of sorts with error
codes and tags, returning both as the return value from a function.
Say what you will about this trick, it does have a significant impact on
code size. I suspect this is primarily due to the compiler having a hard
time optimizing around pointer access.
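
A sketch of the tag-or-error return value. The find function, the type mask, and the error code are hypothetical. The only thing taken from the description above is that bit 31 is free after fetch, so a negative value can never be a tag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_NOENT (-2)  // hypothetical error code

// Return the first tag whose masked type matches, or a negative error.
// Bit 31 is cleared in returned tags, so tags are never negative.
static int32_t toy_find(const uint32_t *tags, size_t count,
        uint32_t type_mask, uint32_t type) {
    for (size_t i = 0; i < count; i++) {
        if ((tags[i] & type_mask) == type) {
            return (int32_t)(tags[i] & 0x7fffffffu);
        }
    }
    return ERR_NOENT;
}

// Callers tell the two cases apart with a single sign check.
static bool res_iserror(int32_t res) {
    return res < 0;
}
```

This avoids passing a tag out through a pointer argument, which is where the code size savings seem to come from.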
While it makes sense to reuse as many code paths as possible, it turns
out that the logic behind the traversal of littlefs's metadata-pairs is
so simple that it's actually cheaper to duplicate the traversal code
where needed.
This means instead of the code path move -> traverse -> movescan -> get
-> traverse -> getscan, we can use the relatively flatter code path of
move -> get.
Tags offer all of the necessary info from the find functions (which
makes sense, this is the structure that stores the info on disk).
Passing around a single tag instead of separate id and type fields
simplifies the internal functions while leveraging the tag's compactness.
Unfortunately, the three different sets of get functions were not
contributing very much, and having three different get functions means
we may be wasting code on redundant code paths.
By dropping the use of the lfs_mattr_t struct in favor of a buffer, we
can combine the three code paths with a bit of tweaking.