commit
a1efa582eb
8
docs/proposals/README.md
Normal file
@ -0,0 +1,8 @@
+# Git LFS Proposals
+
+This directory contains high-level proposals for future Git LFS features.
+Inclusion here does not guarantee when or if a feature will make it into Git
+LFS. It doesn't even guarantee that the specifics won't change.
+
+Everyone is welcome to submit their own proposal as a markdown file in a
+pull request for discussion.
@ -1,6 +1,6 @@
# Extending LFS

Teams who use Git LFS often have custom requirements for how the pointer files and
blobs should be handled. Some examples of extensions that could be built:

* Compress large files on clean, uncompress them on smudge/fetch
@ -8,18 +8,18 @@ blobs should be handled. Some examples of extensions that could be built:
* Scan files on clean to make sure they don't contain sensitive information

The basic extensibility model is that LFS extensions must be registered explicitly, and
they will be invoked on clean and smudge to manipulate the contents of the files as
needed. On clean, LFS itself ensures that the pointer file is updated with all the
information needed to be able to smudge correctly, and the extensions never modify the
pointer file directly.

Note that LFS is currently transitioning away from using the Git smudge filter, in favor
of smudging all files using "git-lfs fetch" post checkout. However, that detail should
be transparent to extensions, since they are still invoked on a per-file basis.

## Registration

To register an LFS extension, it must be added to the Git config. Each extension needs
to define:

* Its unique name. This will be used as part of the key in the pointer file.
@ -27,7 +27,7 @@ to define:
* The command to run on smudge/fetch
* The priority of the extension, which must be a unique, non-negative integer

The sequence "%f" in the clean and smudge commands will be replaced by the filename being
processed.

Here's an example extension registration in the Git config:
@ -45,45 +45,45 @@ Here's an example extension registration in the Git config:

## Clean

When staging a file, Git invokes the LFS clean filter, as described earlier. If no
extensions are installed, the LFS clean filter reads bytes from STDIN, calculates the
SHA-256 signature, and writes the bytes to a temp file. It then moves the temp file into
the appropriate place in .git/lfs/objects and writes a valid pointer file to STDOUT.

When an extension is installed, LFS will invoke the extension to do additional processing
on the bytes before writing them into the temp file. If multiple extensions are
installed, they are invoked in the order defined by their priority. LFS will also insert
a key in the pointer file for each extension that was invoked, indicating both the order
in which the extension was invoked and the oid of the file before that extension was invoked.
All of that information is required to be able to reliably smudge the file later. Each
new line in the pointer file will be of the form

`ext-{priority}-{name} {hash-method}:{hash-of-input-to-extension}`

This naming ensures that all extensions are written in both alphabetical and priority
order, and also shows the progression of changes to the oid as it is processed by the
extensions.

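As a rough illustration of the key format above, here is a minimal Python sketch that builds one pointer line per extension. The `ext_pointer_line` helper, the sample priorities, and the sample byte strings are hypothetical and are not part of Git LFS.

```python
import hashlib


def ext_pointer_line(priority: int, name: str, input_bytes: bytes) -> str:
    """Build 'ext-{priority}-{name} sha256:{hash of the extension's input}'."""
    digest = hashlib.sha256(input_bytes).hexdigest()
    return f"ext-{priority}-{name} sha256:{digest}"


# Hypothetical example: foo (priority 1) sees the original bytes,
# bar (priority 2) sees whatever foo produced.
original = b"hello"
after_foo = b"HELLO"
lines = [
    ext_pointer_line(1, "foo", original),
    ext_pointer_line(2, "bar", after_foo),
]
print("\n".join(lines))
```

One thing the sketch makes easy to check: plain string sorting matches priority order only while all priorities have the same number of digits. For example, `ext-10-baz` sorts before `ext-2-bar`.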

Here's an example sequence, assuming extensions foo and bar are installed, as shown in
the previous section.

* Git passes the original contents of the file to LFS clean over STDIN
* LFS reads those bytes and calculates the original SHA-256 signature as it does so
-* LFS streams the bytes to STDIN of lfs-extension.foo.clean, which is expected to write
+* LFS streams the bytes to STDIN of lfs-ext.foo.clean, which is expected to write
  those bytes, modified or not, to its STDOUT
-* LFS reads the bytes from STDOUT of lfs-extension.foo.clean, calculates the SHA-256
+* LFS reads the bytes from STDOUT of lfs-ext.foo.clean, calculates the SHA-256
-  signature, and writes them to STDIN of lfs-extension.bar.clean, which then writes those
+  signature, and writes them to STDIN of lfs-ext.bar.clean, which then writes those
  bytes, modified or not, to its STDOUT
-* LFS reads the bytes from STDOUT of lfs-extension.bar.clean, calculates the SHA-256
+* LFS reads the bytes from STDOUT of lfs-ext.bar.clean, calculates the SHA-256
  signature, and writes the bytes to a temp file
* When finished, LFS atomically moves the temp file into .git/lfs/objects, as before
* LFS generates the pointer file, with some changes:
  * The oid and size keys are calculated from the final bytes written into the LFS storage
  * LFS also writes keys named ext-1-foo and ext-2-bar into the pointer, along
    with their respective input oids

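The clean sequence above can be sketched in Python as follows. This is only an illustration under several assumptions: the `clean_with_extensions` function is hypothetical, the two demo commands are small Python one-liners standing in for real registered clean commands, and the bytes are buffered in memory rather than streamed to a temp file as LFS would do.

```python
import hashlib
import subprocess
import sys


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def clean_with_extensions(data: bytes, extensions):
    """Run `data` through each extension's clean command in priority order.

    `extensions` is a list of (priority, name, argv) tuples, where argv is a
    hypothetical stand-in for the registered clean command.
    Returns (final_bytes, pointer_text).
    """
    ext_lines = []
    for priority, name, argv in sorted(extensions, key=lambda e: e[0]):
        # Record the oid of the bytes before this extension runs.
        ext_lines.append(f"ext-{priority}-{name} sha256:{sha256_hex(data)}")
        # Feed the bytes to the extension's STDIN and read its STDOUT.
        data = subprocess.run(
            argv, input=data, stdout=subprocess.PIPE, check=True
        ).stdout
    # oid and size describe the final bytes that go into LFS storage.
    pointer = "\n".join(
        ["version https://git-lfs.github.com/spec/v1"]
        + ext_lines
        + [f"oid sha256:{sha256_hex(data)}", f"size {len(data)}"]
    ) + "\n"
    return data, pointer


# Demo stand-ins for extension commands (assumptions, not real extensions).
UPPER = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"]
REVERSE = [sys.executable, "-c",
           "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read()[::-1])"]

stored, pointer = clean_with_extensions(
    b"hello", [(2, "bar", REVERSE), (1, "foo", UPPER)]
)
print(pointer, end="")
```

Each `ext-` line records the hash of that extension's input, so the chain of hashes in the pointer traces the content from the original file to the stored blob.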

Here's an example pointer file, for a file processed by extensions foo and bar:

```
version https://git-lfs.github.com/spec/v1
ext-1-foo sha256:{original hash}
@ -93,9 +93,9 @@ size 123
(ending \n)
```

Note: as an optimization, if an extension just does a pass-through, its key can be
omitted from the pointer file. This will make smudging the file a bit more efficient
since that extension can be skipped. LFS can detect a pass-through extension because the
input and output oids will be the same.

This implies that extensions must have no side effects other than writing to their STDOUT.
@ -104,48 +104,48 @@ Otherwise LFS has no way to know what extensions modified a file.

## Smudge

When a file is checked out, Git invokes the LFS smudge filter, as described earlier. If
no extensions are installed, the LFS smudge filter inspects the first 100 bytes read
from STDIN, and if it is a pointer file, uses the oid to find the correct object in
the LFS storage, and writes those bytes to STDOUT so that Git can write them to the
working directory.

If the pointer file indicates that extensions were invoked on that file, then those
extensions must be installed in order to smudge. If they are not installed, not found,
or unusable for any reason, LFS will fail to smudge the file and output an error
indicating which extension is missing.

Each of the extensions indicated in the pointer file must be invoked in reverse order to
undo the changes they made to the contents of the file. After each extension is invoked,
LFS will compare the SHA-256 signature of the bytes output by the extension with the oid
stored in the pointer file as the original input to that same extension. Those
signatures must match, otherwise the extension did not undo its changes correctly. In
that case, LFS fails to smudge the file, and outputs an error indicating which extension
is failing.

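To make the ordering and the missing-extension check concrete, here is a small Python sketch that parses the `ext-` keys out of a pointer file and works out the order in which extensions would run on smudge. The `smudge_plan` function, the regular expression, and the placeholder hashes (`aaaa`, `bbbb`, `cccc`) are assumptions made for illustration only.

```python
import re

# Matches lines of the form: ext-{priority}-{name} {hash-method}:{hash}
EXT_LINE = re.compile(r"^ext-(\d+)-(\S+) (\w+):([0-9a-f]+)$")


def smudge_plan(pointer_text: str, registered: set):
    """Return [(name, expected_output_oid), ...] in the order to run on smudge."""
    found = []
    for line in pointer_text.splitlines():
        m = EXT_LINE.match(line)
        if m:
            priority, name, _method, oid = m.groups()
            found.append((int(priority), name, oid))
    # Every extension named in the pointer must be available locally.
    missing = [name for _, name, _ in found if name not in registered]
    if missing:
        raise RuntimeError(f"missing LFS extension(s): {', '.join(missing)}")
    # Clean ran in ascending priority, so smudge runs in descending priority.
    found.sort(key=lambda e: e[0], reverse=True)
    return [(name, oid) for _, name, oid in found]


pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "ext-1-foo sha256:aaaa\n"   # placeholder hash
    "ext-2-bar sha256:bbbb\n"   # placeholder hash
    "oid sha256:cccc\n"         # placeholder hash
    "size 123\n"
)
plan = smudge_plan(pointer, {"foo", "bar"})
print(plan)
```

The oid paired with each extension is the hash its smudge output is expected to have, since the pointer stores the hash of that extension's original input.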

Here's an example sequence, indicating how LFS will smudge the pointer file shown in the
previous section:

* Git passes the bytes of the pointer file to LFS smudge over STDIN. Note that when
  using "git lfs fetch", LFS reads the files directly from disk rather than off STDIN. The
  rest of the steps are unaffected either way.
* LFS reads those bytes and inspects them to see if this is a pointer file. If it was
  not, the bytes would just be passed through to STDOUT.
* Since it is a pointer file, LFS reads the whole file off STDIN, parses it, and
  determines that extensions foo and bar both processed the file, in that order.
* LFS uses the value of the oid key to find the blob in the .git/lfs/objects folder, or
  downloads it from the server as needed
-* LFS writes the contents of the blob to STDIN of lfs-extension.bar.smudge, which
+* LFS writes the contents of the blob to STDIN of lfs-ext.bar.smudge, which
  modifies them as needed and writes them to its STDOUT
-* LFS reads the bytes from STDOUT of lfs-extension.bar.smudge, calculates the SHA-256
+* LFS reads the bytes from STDOUT of lfs-ext.bar.smudge, calculates the SHA-256
-  signature, and writes the bytes to STDIN of lfs-extension.foo.smudge, which modifies them
+  signature, and writes the bytes to STDIN of lfs-ext.foo.smudge, which modifies them
  as needed and writes them to its STDOUT
-* LFS reads the bytes from STDOUT of lfs-extension.foo.smudge, calculates the SHA-256
+* LFS reads the bytes from STDOUT of lfs-ext.foo.smudge, calculates the SHA-256
  signature, and writes the bytes to its own STDOUT
* At the end, ensure that the hashes calculated on the outputs of foo and bar match their
  corresponding input hashes from the pointer file. If not, write a descriptive error
  message indicating which extension failed to undo its changes.
* Question: On error, should we overwrite the file in the working directory with the
  original pointer file? Can this be done reliably?

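The smudge sequence above might look roughly like this in Python. It is a sketch under stated assumptions: `smudge_with_extensions` is a hypothetical function, the demo commands are Python one-liners standing in for real registered smudge commands, everything is buffered in memory, and the hash check is done after each extension rather than once at the end.

```python
import hashlib
import subprocess
import sys


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def smudge_with_extensions(blob: bytes, steps):
    """Undo extensions in reverse order, verifying each output.

    `steps` is a list of (name, argv, expected_oid) already in smudge order
    (last-applied extension first). expected_oid is the hash recorded in the
    pointer file as that extension's original input.
    """
    data = blob
    for name, argv, expected_oid in steps:
        # Feed the current bytes to the extension's STDIN, read its STDOUT.
        data = subprocess.run(
            argv, input=data, stdout=subprocess.PIPE, check=True
        ).stdout
        if sha256_hex(data) != expected_oid:
            raise RuntimeError(f"extension {name!r} failed to undo its changes")
    return data


# Demo stand-ins for smudge commands (assumptions, not real extensions).
REVERSE = [sys.executable, "-c",
           "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read()[::-1])"]
LOWER = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().lower())"]

restored = smudge_with_extensions(
    b"OLLEH",
    [
        ("bar", REVERSE, sha256_hex(b"HELLO")),  # bar's original input
        ("foo", LOWER, sha256_hex(b"hello")),    # foo's original input
    ],
)
print(restored)
```

Checking after each step makes it straightforward to name the failing extension in the error message. Whether the working-directory file should then be replaced with the pointer is the open question noted in the list.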
@ -176,4 +176,4 @@ error message to its STDERR. Because the file was not smudged correctly, LFS ca
that file in the working directory. LFS will ensure that the pointer file is written to
both the index and working directory. In addition, it will display the error messages for
any files that could not be smudged (and keep those errors in a log), so that the user can
diagnose the failure and then rerun "git-lfs fetch" to fix up any remaining pointer files.