# Adding Custom Transfer Agents to LFS

## Introduction

Git LFS supports multiple ways to transfer (upload and download) files. In the
core client, the basic way to do this is a one-off HTTP request to the URL
returned from the LFS API for a given object. The core client also supports
extensions to allow resuming of downloads (via `Range` headers) and uploads (via
the [tus.io](http://tus.io) protocol).

Some people might want to transfer content in other ways, however.
To enable this, git-lfs allows configuring Custom Transfers, which are
simply processes that adhere to the protocol defined later in this
document. git-lfs will invoke the process at the start of all transfers,
and will communicate with the process via stdin/stdout for each transfer.

## Custom Transfer Type Selection

In the LFS API request, the client includes a list of transfer types it
supports. When replying, the API server will pick one of these and make any
necessary adjustments to the returned object actions, in case the picked
transfer type needs custom details about how to do each transfer.

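
As a rough illustration of the selection step, the sketch below builds a request
body that advertises a custom transfer type and reads back the server's choice.
It assumes the `transfers` (request) and `transfer` (response) fields of the Git
LFS Batch API; the function names and the `nfs` type are illustrative only.

```python
def build_batch_request(operation, objects, custom_transfers):
    """Build a Batch API request body advertising custom transfer types.

    `basic` is appended so the server can always fall back to plain HTTP.
    """
    return {
        "operation": operation,
        "transfers": list(custom_transfers) + ["basic"],
        "objects": objects,
    }


def chosen_transfer(batch_response):
    """Return the transfer type picked by the server (assumed `basic` if absent)."""
    return batch_response.get("transfer", "basic")


request = build_batch_request(
    "download",
    [{"oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e",
      "size": 21245}],
    ["nfs"],
)
```

If the server replies with `"transfer": "nfs"`, the object actions in that
response are meant for the `nfs` agent rather than for the built-in HTTP
transfer.
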
## Using a Custom Transfer Type without the API server

In some cases the transfer agent can figure out by itself how and where
the transfers should be made, without having to query the API server.
In this case it's possible to use the custom transfer agent directly,
without querying the server, by using the following config option:

* `lfs.standalonetransferagent`

  Allows the specified custom transfer agent to be used directly
  for transferring files, without asking the server how the transfers
  should be made. The custom transfer agent has to be defined in a
  `lfs.customtransfer.<name>` settings group.

## Defining a Custom Transfer Type

A custom transfer process is defined under a settings group called
`lfs.customtransfer.<name>`, where `<name>` is an identifier (see
[Naming](#naming) below).

* `lfs.customtransfer.<name>.path`

  `path` should point to the process you wish to invoke. This will be invoked at
  the start of all transfers (possibly many times, see the `concurrent` option
  below) and the protocol over stdin/stdout is defined below in the
  [Protocol](#protocol) section.

* `lfs.customtransfer.<name>.args`

  If the custom transfer process requires any arguments, these can be provided
  here. Typically you would only need this if your process was multi-purpose or
  particularly flexible; most of the time you won't need it.

* `lfs.customtransfer.<name>.concurrent`

  If true (the default), git-lfs will invoke the custom transfer process
  multiple times in parallel, according to `lfs.concurrenttransfers`, splitting
  the transfer workload between the processes.

  If you would prefer that only one instance of the transfer process is invoked,
  maybe because you want to do your own parallelism internally (e.g. slicing
  files into parts), set this to false.

* `lfs.customtransfer.<name>.direction`

  Specifies which direction the custom transfer process supports, either
  `download`, `upload`, or `both`. The default if unspecified is `both`.

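
To make the settings group concrete, here is a small sketch that assembles the
`git config` invocations for one agent. The helper name, the agent name `nfs`
and the executable path in the usage note are hypothetical; only the config keys
come from this document. The commands are returned rather than executed.

```python
def git_config_commands(name, path, args=None, concurrent=True,
                        direction="both", standalone=False):
    """Return `git config` argument lists that define a custom transfer agent."""
    if direction not in ("download", "upload", "both"):
        raise ValueError(f"invalid direction: {direction}")
    prefix = f"lfs.customtransfer.{name}"
    settings = [(f"{prefix}.path", path)]
    if args:
        settings.append((f"{prefix}.args", args))
    settings.append((f"{prefix}.concurrent", "true" if concurrent else "false"))
    settings.append((f"{prefix}.direction", direction))
    if standalone:
        # Use the agent directly, without asking the API server.
        settings.append(("lfs.standalonetransferagent", name))
    return [["git", "config", key, value] for key, value in settings]
```

For example, `git_config_commands("nfs", "/usr/local/bin/lfs-nfs-agent",
concurrent=False)` yields three commands, setting `path`, `concurrent` and
`direction` for the `nfs` group.
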
## Naming

Each custom transfer must have a name which is unique to the underlying
mechanism, and the client and the server must agree on that name. The client
will advertise this name to the server as a supported transfer approach, and if
the server supports it, it will return relevant object action links. Because
these may be very different from standard HTTP URLs it's important that the
client and server agree on the name.

For example, let's say I've implemented a custom transfer process which uses
NFS. I could call this transfer type `nfs` - although it's not specific to my
configuration exactly, it is specific to the way NFS works, and the server will
need to give me different URLs. Assuming I define my transfer like this, and the
server supports it, I might start getting object action links back like
`nfs://<host>/path/to/object`

## Protocol

The git-lfs client communicates with the custom transfer process via the stdin
and stdout streams. No file content is communicated on these streams, only
request / response metadata. The metadata exchanged is always in JSON format.
External files will be referenced when actual content is exchanged.

### Line Delimited JSON

Because multiple JSON messages will be exchanged on the same stream, it's useful
to delimit them explicitly rather than have the parser find the closing `}` in
an arbitrary stream. Therefore each JSON structure will be sent and received on
a **single line** as per [Line Delimited
JSON](https://en.wikipedia.org/wiki/JSON_Streaming#Line_delimited_JSON_2).

In other words, when git-lfs sends a JSON message to the custom transfer it will
be on a single line, with a line feed at the end. The transfer process must
respond the same way by writing a JSON structure back to stdout with a single
line feed at the end (and flush the output).

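
The framing described above can be sketched with two small helpers. This is a
minimal illustration rather than a reference implementation; the function names
are hypothetical, and the streams default to the process's own stdin/stdout.

```python
import json
import sys


def write_message(message, stream=None):
    """Write one JSON message on a single line, then flush."""
    stream = sys.stdout if stream is None else stream
    # json.dumps without `indent` never emits raw newlines, so the
    # message is guaranteed to occupy exactly one line.
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def read_messages(stream=None):
    """Yield one parsed JSON message per non-empty input line."""
    stream = sys.stdin if stream is None else stream
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Flushing after every message matters because git-lfs waits for each response;
output left in a buffer would stall the exchange.
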
### Protocol Stages

The protocol consists of 3 stages:

#### Stage 1: Initiation

Immediately after invoking a custom transfer process, git-lfs sends initiation
data to the process over stdin. This tells the process useful information about
the configuration.

The message will look like this:

```json
{ "event": "init", "operation": "download", "concurrent": true, "concurrenttransfers": 3 }
```

* `event`: Always `init` to identify this message
* `operation`: will be `upload` or `download` depending on transfer direction
* `concurrent`: reflects the value of `lfs.customtransfer.<name>.concurrent`, in
  case the process needs to know
* `concurrenttransfers`: reflects the value of `lfs.concurrenttransfers`, in case
  the transfer process wants to implement its own concurrency and wants to
  respect this setting.

The transfer process should use the information it needs from the initiation
structure, and also perform any one-off setup tasks it needs to do. It should
then respond on stdout with a simple empty confirmation structure, as follows:

```json
{ }
```

Or if there was an error:

```json
{ "error": { "code": 32, "message": "Some init failure message" } }
```

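
A minimal sketch of how an agent might answer the initiation message. The
function name and the `supported_operations` parameter are hypothetical, and the
error code `32` is simply borrowed from the example above; this document does
not assign meanings to specific codes.

```python
def handle_init(message, supported_operations=("upload", "download")):
    """Return the reply to an `init` message: `{}` on success, else an error."""
    if message.get("event") != "init":
        return {"error": {"code": 32, "message": "expected an init event"}}
    operation = message.get("operation")
    if operation not in supported_operations:
        return {"error": {"code": 32,
                          "message": f"unsupported operation: {operation}"}}
    # One-off setup (opening connections, reading config) would go here.
    return {}
```

An agent that only supports downloads could call
`handle_init(message, supported_operations=("download",))` and would then reject
an `upload` session with an error reply.
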
#### Stage 2: 0..N Transfers

After the initiation exchange, git-lfs will send any number of transfer
requests to the stdin of the transfer process, in a serial sequence. Once a
transfer request is sent to the process, git-lfs awaits a completion response
before sending the next request.

##### Uploads

For uploads the request sent from git-lfs to the transfer process will look
like this:

```json
{ "event": "upload", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a", "size": 346232, "path": "/path/to/file.png", "action": { "href": "nfs://server/path", "header": { "key": "value" } } }
```

* `event`: Always `upload` to identify this message
* `oid`: the identifier of the LFS object
* `size`: the size of the LFS object
* `path`: the file which the transfer process should read the upload data from
* `action`: the `upload` action copied from the response from the batch API.
  This contains `href` and `header` contents, which are named per HTTP
  conventions, but can be interpreted however the custom transfer agent wishes
  (this is an NFS example, but it doesn't even have to be a URL). Generally,
  `href` will give the primary connection details, with `header` containing any
  miscellaneous information needed.

The transfer process should post one or more [progress messages](#progress) and
then a final completion message as follows:

```json
{ "event": "complete", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a" }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object

Or if there was an error in the transfer:

```json
{ "event": "complete", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a", "error": { "code": 2, "message": "Explain what happened to this transfer" } }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object
* `error`: Should contain a `code` and `message` explaining the error

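
The upload exchange might be handled along these lines. This is only a sketch:
it assumes a hypothetical agent that "uploads" by copying the file into a local
directory (`store_dir` is an illustrative parameter, not part of the protocol),
and the error code `2` just mirrors the example above. The generator yields the
messages the agent would write to stdout, in order.

```python
import os

CHUNK_SIZE = 64 * 1024  # arbitrary chunk size chosen for this sketch


def handle_upload(request, store_dir):
    """Yield progress messages, then one completion message, for an upload."""
    oid = request["oid"]
    sent = 0
    try:
        os.makedirs(store_dir, exist_ok=True)
        with open(request["path"], "rb") as src, \
                open(os.path.join(store_dir, oid), "wb") as dst:
            while True:
                chunk = src.read(CHUNK_SIZE)
                if not chunk:
                    break
                dst.write(chunk)
                sent += len(chunk)
                yield {"event": "progress", "oid": oid,
                       "bytesSoFar": sent, "bytesSinceLast": len(chunk)}
    except OSError as exc:
        # A failed transfer is reported in the response, not by exiting.
        yield {"event": "complete", "oid": oid,
               "error": {"code": 2, "message": str(exc)}}
        return
    yield {"event": "complete", "oid": oid}
```

A real agent would replace the local copy with whatever the `action` details
call for, for example a write to the location named by `href`.
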
##### Downloads

For downloads the request sent from git-lfs to the transfer process will look
like this:

```json
{ "event": "download", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "size": 21245, "action": { "href": "nfs://server/path", "header": { "key": "value" } } }
```

* `event`: Always `download` to identify this message
* `oid`: the identifier of the LFS object
* `size`: the size of the LFS object
* `action`: the `download` action copied from the response from the batch API.
  This contains `href` and `header` contents, which are named per HTTP
  conventions, but can be interpreted however the custom transfer agent wishes
  (this is an NFS example, but it doesn't even have to be a URL). Generally,
  `href` will give the primary connection details, with `header` containing any
  miscellaneous information needed.

Note there is no file path included in the download request; the transfer
process should create a file itself and return the path in the final response
after completion (see below).

The transfer process should post one or more [progress messages](#progress) and
then a final completion message as follows:

```json
{ "event": "complete", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "path": "/path/to/file.png" }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object
* `path`: the path to a file containing the downloaded data, which the transfer
  process relinquishes control of to git-lfs. git-lfs will move the file into
  LFS storage.

Or, if there was a failure transferring this item:

```json
{ "event": "complete", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "error": { "code": 2, "message": "Explain what happened to this transfer" } }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object
* `error`: Should contain a `code` and `message` explaining the error

Errors for a single transfer request should not terminate the process. The error
should be returned in the response structure instead.

The custom transfer adapter does not need to check the SHA of the file content
it has downloaded; git-lfs will do that before moving the final content into
the LFS store.

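
A download handler could look roughly like this. As a sketch it assumes a
hypothetical agent that fetches objects from a local directory (`store_dir` is
illustrative, not part of the protocol), and it omits progress messages for
brevity. The key point is that the agent creates the destination file itself and
hands its path back to git-lfs.

```python
import os
import shutil
import tempfile


def handle_download(request, store_dir):
    """Return the completion message for one download request."""
    oid = request["oid"]
    # The request carries no path, so the agent picks its own temporary file.
    fd, temp_path = tempfile.mkstemp(prefix="lfs-download-")
    try:
        with os.fdopen(fd, "wb") as dst, \
                open(os.path.join(store_dir, oid), "rb") as src:
            shutil.copyfileobj(src, dst)
    except OSError as exc:
        os.remove(temp_path)  # do not leave a partial file behind
        return {"event": "complete", "oid": oid,
                "error": {"code": 2, "message": str(exc)}}
    # No checksum verification here: git-lfs verifies the content itself.
    return {"event": "complete", "oid": oid, "path": temp_path}
```

After a successful reply the agent should not touch the file again, since
git-lfs takes ownership of it.
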
##### Progress

In order to support progress reporting while data is uploading / downloading,
the transfer process should post messages to stdout as follows before sending
the final completion message:

```json
{ "event": "progress", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "bytesSoFar": 1234, "bytesSinceLast": 64 }
```

* `event`: Always `progress` to identify this message
* `oid`: the identifier of the LFS object
* `bytesSoFar`: the total number of bytes transferred so far
* `bytesSinceLast`: the number of bytes transferred since the last progress
  message

The transfer process should post these messages such that the last one sent
has `bytesSoFar` equal to the file size on success.

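
The relationship between the two counters can be sketched as a running total
over chunk sizes. The function name and the `chunk_sizes` input are
hypothetical; in a real agent the sizes would come from the actual reads or
writes.

```python
def progress_messages(oid, chunk_sizes):
    """Yield one progress message per transferred chunk."""
    bytes_so_far = 0
    for chunk_size in chunk_sizes:
        bytes_so_far += chunk_size
        yield {
            "event": "progress",
            "oid": oid,
            "bytesSoFar": bytes_so_far,    # running total
            "bytesSinceLast": chunk_size,  # delta since the previous message
        }
```

For chunks of 1170 and 64 bytes, the second message carries `bytesSoFar` 1234
and `bytesSinceLast` 64, matching the example message above.
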
#### Stage 3: Finish & Cleanup

When all transfers have been processed, git-lfs will send the following message
to the stdin of the transfer process:

```json
{ "event": "terminate" }
```

On receiving this message the transfer process should clean up and terminate.
No response is expected.

## Error handling

Any unexpected fatal errors in the transfer process (not errors specific to a
transfer request) should set the exit code to non-zero and print information to
stderr. Otherwise the exit code should be 0 even if some transfers failed.

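
Putting the stages and the exit-code rule together, an agent's main loop might
be sketched as follows. The `handlers` mapping (event name to a function that
yields reply messages) is a hypothetical structure for this example, and
treating an unknown event as fatal is a design choice of the sketch, not
something this document mandates.

```python
import json
import sys


def run(stdin, stdout, handlers):
    """Serve one session; per-transfer errors never change the return value."""
    def send(message):
        stdout.write(json.dumps(message) + "\n")
        stdout.flush()

    for line in stdin:
        if not line.strip():
            continue
        message = json.loads(line)
        event = message.get("event")
        if event == "init":
            send({})  # empty confirmation structure
        elif event == "terminate":
            break  # clean up and stop; no response is expected
        elif event in handlers:
            for reply in handlers[event](message):
                send(reply)
        else:
            raise ValueError(f"unexpected event: {event!r}")
    return 0


def main(handlers):
    """Map fatal errors to a non-zero exit code and a message on stderr."""
    try:
        return run(sys.stdin, sys.stdout, handlers)
    except Exception as exc:
        print(f"fatal: {exc}", file=sys.stderr)
        return 1
```

An executable agent would end with `sys.exit(main(handlers))`, so that a handler
reporting a failed transfer in its `complete` message still leaves the exit code
at 0.
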
## A Note On Verify Actions

You may have noticed that only the `upload` and `download` actions are
passed to the custom transfer agent for processing. What about the `verify`
action, if the API returns one?

Custom transfer agents do not handle the verification process, only the
upload and download of content. The verify link is typically used to notify
a system *other* than the actual content store after an upload was completed,
therefore it makes more sense for that to be handled via the normal API process.