226 Commits

Author SHA1 Message Date
Chris Darroch
d5bef9f99a tq: make panic messages consistent
Several panic() calls in the basic and tus.io transfer
queue adapters have very similar messages, which we can
make more consistent with each other; this will ease
translation support when we make these messages translatable
in a subsequent commit.
2022-01-29 22:10:47 -08:00
Chris Darroch
0f2f1cdd9f lfs*,locking,tq: revise ancillary error messages
We rephrase some error message strings which are used when
wrapping other errors so they are clearer and more consistent
with other messages.  Note that these strings will be
prepended to the wrapper errors' messages.

In the lfs/diff_index_scanner.go file in particular we
rephrase the additional message to include a full Git
command ("git diff-index"), which is similar to how errors
are reported in the git package.
2022-01-29 22:00:41 -08:00
Chris Darroch
71055f18ec tq,t: remove transfer queue message prefix
One error message output by the transfer queue currently
begins with a "tq:" prefix, which is unlike any other error
messages from that package, or from anywhere else, for that
matter.

We can therefore simplify our message string slightly by
simply removing this prefix from it.

Note that this message is not yet passed as a translation
string, but we will address that issue in a subsequent commit.
2022-01-29 21:59:22 -08:00
Chris Darroch
115d7a8e5b tq: simplify HTTP 403 status code messages
Each of the upload transfer queue adapters reports an error
with a similar message in the case of a 403 response, and all
of these messages begin with an "http:" prefix (even the
one in the SSH adapter).  However, other status codes are not
reported the same way.

This prefix should either be removed from the translatable
message text, or removed entirely; we choose the latter option
here as the simplest.  We also now interpolate the status code
rather than make it a fixed part of the translatable string,
and start the messages with a capital letter.
2022-01-29 21:59:22 -08:00
Chris Darroch
460e5dce90 commands,tq,t: remove Git LFS message prefixes
Several error messages output by the "git lfs track" command and
the transfer queue currently begin with a "Git LFS:" prefix, which
is unlike any other error messages from either that command or
that package, or anywhere else, for that matter.

We can therefore simplify our translation message strings slightly
by simply removing this prefix from them.
2022-01-29 20:21:47 -08:00
brian m. carlson
08774bf091
tq: make strings translatable
Remove the Verb method on the Direction object in favor of a method
formatting the operation in progress.  This is easier to translate and
will prevent sentence fragments from appearing in strings.

Additionally, move a variable down into a function so that we can
translate it, since strings at the top level cannot be translated due to
the locale object not being initialized yet.
2022-01-18 17:38:25 +00:00
Dimitris Apostolou
21b0402690
Fix typos 2022-01-05 08:49:08 +02:00
brian m. carlson
51a5ca3f68
tq: pass hash algorithm during batch requests
During a batch request, pass the hash algorithm we're using to the
remote server, and read the value, if any, that we get back.  If it is
not either absent or the string "sha256", fail, since that means that
the client and server don't agree on the proper hash algorithm.
2021-09-10 14:39:01 +00:00
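The check described in 51a5ca3f68 can be sketched roughly as follows. This is only an illustration: the function name `checkHashAlgo` and the error text are assumptions, not the names used in the `tq` package.

```go
package main

import "fmt"

// checkHashAlgo is a hypothetical helper. It accepts a server-provided hash
// algorithm that is either absent (empty) or "sha256" and rejects anything
// else, since that would mean client and server disagree.
func checkHashAlgo(serverAlgo string) error {
	if serverAlgo == "" || serverAlgo == "sha256" {
		return nil
	}
	return fmt.Errorf("unsupported hash algorithm: %q", serverAlgo)
}

func main() {
	for _, algo := range []string{"", "sha256", "sha512"} {
		fmt.Printf("%q -> %v\n", algo, checkHashAlgo(algo))
	}
}
```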
brian m. carlson
087db1de70
Set package version to v3
Since we're about to do a v3.0.0 release, let's bump the version to v3.

Make this change automatically with the following command to avoid any
missed items:

  git grep -l github.com/git-lfs/git-lfs/v2 | \
  xargs sed -i -e 's!github.com/git-lfs/git-lfs/v2!github.com/git-lfs/git-lfs/v3!g'
2021-09-02 20:41:08 +00:00
Fabio Huser
a3ecbcc7f6 Fix 429 retry-after handling of the LFS batch API endpoint
While the 429 retry-after HTTP error handling for the storage endpoint works
just as expected, the same was not true for the batch API endpoint. In case of a
429 error, the call to the batch API endpoint would be retried and the
`retry-after` HTTP header honoured correctly, but the LFS command would still
exit with a generic `LFS: Error` message. This happened because, while the
retry-able error from the `Batch` method was correctly overwritten by
`nil` in case of a retry, it would later be converted back into a
retry-able error and hence no longer be `nil`. The original 429 retry-able
HTTP error would then bubble up to the final error check, although the
exact operation was successfully retried. Furthermore, the overwrite with
`nil` during correct handling of the first object caused all subsequent objects
to fail and hence never be enqueued for a retry. This was solved by
removing the immediate overwrite by `nil` in case of a retry-able-later
error and doing the final error conversion based on the whole batch
result instead of a single object result.
2021-08-15 10:18:32 +02:00
Chris Darroch
dd8e306e31 all: update go.mod module path with explicit v2
When our go.mod file was introduced in commit
114e85c2002091eb415040923d872f8e4a4bc636 in PR #3208, the module
path chosen did not include a trailing /v2 component.  However,
the Go modules specification now advises that module paths must
have a "major version suffix" which matches the release version.

We therefore add a /v2 suffix to our module path and all its
instances in import paths.

See also https://golang.org/ref/mod#major-version-suffixes for
details regarding the Go module system's major version suffix rule.
2021-08-09 23:18:38 -07:00
Chris Darroch
e1624ca685 tq: drop unused ClearTempStorage adapter method
The ClearTempStorage() method of the transfer Adapter interface,
while implemented for most adapters, is never actually called,
and moreover is dangerous in some of its current implementations
if it were ever to be called.

Specifically in the basic upload and download adapters and the
SSH adapter, the ClearTempStorage() method relies on an internal
tempDir() method to return the path to the temporary directory
to be cleared, which ClearTempStorage() then removes entirely.

The tempDir() methods, however, can return os.TempDir() if they
are unable to create a local temporary directory within the
Git LFS storage directory, and also do not cache the path to
the directory they first returned.  So were ClearTempStorage()
ever to be called in one of these adapters, its invocation of
tempDir() might just return the system temporary directory
(e.g., /tmp on Unix), which it would then attempt to remove.

The ClearTempStorage() method was introduced in commits
f124f0585f73279195cb20ecb4aa9e149512b48a and
940a91a8df18eaf16e9a0162999f26d59fa9f5af of PR #1265, but
has never been called by any user of the transfer
adapter interface.

We therefore just remove this method and its implementations
from all the current transfer adapters.
2021-07-20 23:51:32 -04:00
Chris Darroch
461af6e0a4 tq/ssh.go: remove extra temp filename comment
We remove one extra copy of a comment added in commit
9c46a38281283d919e841cf3dcae35b4f686e0e1 of PR #4446,
where the first copy was already in place from commit
594f8e386cce3441e06c9094ab5e251f0e07ca1f in the same PR.
2021-07-20 23:35:39 -04:00
Chris Darroch
832be8daaa tq/transfer.go: fix comment for Begin method
We update the comment describing the Begin method of the
Adapter interface to match that for the corresponding method
of the SSH transfer adapter, which was added in commit
594f8e386cce3441e06c9094ab5e251f0e07ca1f of PR #4446.
The older comment for the interface description has been
out of sync with the actual method signature since at least
commit 303156e0e22ef7cdd863605fbb501174dc25e08a of PR #1774,
when the maxConcurrency integer argument was replaced with
the current "cfg" AdapterConfig one.
2021-07-20 23:29:25 -04:00
brian m. carlson
31d3fb7cee
lfsapi: move SSHTransfer instantiation to lfsapi client
The lfsapi client is used to perform operations for SSH authentication
already, so we know we'll have one wherever SSH operations might be
done.  Let's move the instantiation to the client so that we can reuse
it both in the transfer queue code and in the locking code as well.
2021-07-20 19:15:59 +00:00
brian m. carlson
9c46a38281
ssh: support concurrent transfers using the pure SSH protocol
When using the pure SSH-based protocol, we can get much higher speeds by
multiplexing multiple connections on the same SSH connection.  If we're
using OpenSSH, let's enable the ControlMaster option unless
lfs.ssh.automultiplex is set to false, and multiplex these shell
operations over one connection.

We prefer XDG_RUNTIME_DIR because it's guaranteed to be private and we
can share many connections over one socket, but if that's not set, let's
default to creating a new temporary directory for the socket.  On
Windows, where the native SSH client doesn't support ControlMaster,
we should fall back to using multiple connections since we use
ControlMaster=auto.

Note that the option exists because users may already be using SSH
multiplexing and we would want to provide a way for them to disable
this, in addition to the case where users have an old or broken OpenSSH
which cannot support this option.

We pass the connection object into each worker and adjust our transfer
code to pass it into each function we invoke.  We also make sure to
properly terminate each connection at the end by reducing our connection
count to 0, which closes the extra (i.e., all) connections.

Co-authored-by: Chris Darroch <chrisd8088@github.com>
2021-07-20 19:15:59 +00:00
brian m. carlson
898dc43d1d
ssh: support multiple connections for one transfer
Since we currently support multiple connections for HTTPS, let's also
add multiple connections for pure SSH connections.  For now, we spawn a
whole new connection, but in the future we'll support using OpenSSH's
ControlMaster flag.

Introduce several new functions to create SSH connections and adjust the
number of connections being used.
2021-07-20 19:15:59 +00:00
brian m. carlson
594f8e386c
tq: implement a pure SSH-based protocol for transfers
A pure SSH-based protocol has been a long time request from many users,
so let's implement one.  Implement basic upload and download support,
plus batch requests, and support pkt-line tracing for text packets to
make debugging easier.

Note that locking is not yet supported; that will come in a future
patch.

We prefer the Endpoint function to the RemoteEndpoint function because
the former handles lfs.url and the latter does not.

Update a comment about the shared temporary directory; it is no longer
cleared automatically by the adapter, and instead the cleanup happens
later in other code.  Therefore, it is safe to share the directory among
the transport adapters.

Co-authored-by: Chris Darroch <chrisd8088@github.com>
2021-07-20 19:15:40 +00:00
brian m. carlson
3e5172123e
tq: turn the batch client into an interface
Right now, we always perform a batch operation using an instance of the
transfer queue client, which is HTTP based.  However, in the future,
we'll want to add an SSH-based option, so let's turn the client into a
simple interface we can use to abstract this away.
2021-07-20 18:39:04 +00:00
brian m. carlson
dcfd29419e
Remove NTLM support
Our NTLM support has been known to be broken in various situations for a
while, specifically on Windows.  The core team is unable to troubleshoot
these problems, and nobody has stepped up to maintain the NTLM support.
In addition, NTLM uses cryptography and security techniques that are
known to be insecure, such as the algorithms DES, MD4, and MD5, as well
as simple, unsalted hashes of passwords.

Since we now support Kerberos, most users should be able to replace
their use of NTLM with Kerberos instead.  Users have reported this
working on Windows and it is known to work well on at least Debian as
well.  Drop support for NTLM and remove it from the codebase.
2021-02-02 16:41:41 +00:00
Billy Keyes
cf022164c6 tq: add exponential backoff for retries
Previously, retries happened as fast as possible unless the server
provided the Retry-After header. This is effective for certain types of
errors, but not when the LFS server is experiencing a temporary but not
instantaneous failure. Delaying between retries lets the server recover
and the LFS operation complete.

Delays start at a fixed 250ms for the first retry and double with each
successive retry up to a configurable maximum delay, 10s by default. The
maximum retry is configurable using lfs.transfer.maxretrydelay. Delays
can be disabled by setting the max delay to 0.
2020-04-15 14:10:16 -07:00
vend_natalie.chen
d7d9fd711a fix upload retry 'file already closed' issue 2020-02-26 11:13:54 +08:00
brian m. carlson
a514c7cf43
Check error when creating local storage directory
When we have an error creating the local storage directory, such as if
the permissions are wrong, we fail to produce a helpful error message,
since we fail to check the error we produce.

Let's take the error we get in such a case and pass it through to the
transfer queue, where we can handle it as any other error.  This means
that if only one directory has a problem, we can transfer the rest of
the objects and fail for only the one problematic object.
2020-02-10 16:07:59 +00:00
brian m. carlson
6d29072003
creds: move Access types into creds package
In the future, we're going to need to access the access-related types
in the lfshttp package.  To avoid an import loop, move Access and
AccessMode into the creds package.  Add constructors and accessors since
the members are private.
2019-12-09 15:35:52 +00:00
brian m. carlson
686939a3ab
Merge branch 'master' into crash-on-lfs-checkout 2019-12-06 20:25:33 +00:00
brian m. carlson
1157c0df53
tq: retry batch failures
If the batch operation fails due to an error instead of a bad HTTP
status code, we'll abort the batch operation and retry.  This appears to
be a regression from 1412d6e4 ("Don't fail if we lack objects the server
has", 2019-04-30), which caused us to handle errors differently.

Since there are two error returns from enqueueAndCollectRetriesFor,
let's wrap the batch error case as a retriable error and not abort if we
find a retriable error later on.  This lets us continue to abort if we
get a missing object, which should be fatal, but retry in the more
common network failure case.
2019-12-06 19:28:42 +00:00
Daisuke Ban
a38f8d54bf Store temporary file name for removing on defer block. 2019-12-06 10:50:14 +09:00
Daisuke Ban
75cbe38ead Add nil-check on defer block of DoTransfer()
- If os.OpenFile() returns an error, then the variable "f" is nil, so handle this case
2019-12-02 15:39:47 +09:00
brian m. carlson
28f9b73d2f
Honor lfs.url when deciding on transfer adapters
Currently, if you have a remote with a file URL but have lfs.url set to
point to a non-file URL, your transfers will fail.  This occurs because
we look up whether to use a transfer adapter based on the remote URL,
but we also look up whether to default to the built-in transfer adapter
based on the remote URL, not the endpoint URL.  We then attempt to pass
the HTTP URL to the standalone transfer adapter, which doesn't work.

Look up the endpoint URL as well and pass that to the code which
determines whether we should use the builtin transfer adapter.  If the
actual endpoint is not a file URL, then we won't suggest the builtin
transfer adapter.
2019-11-06 22:23:22 +00:00
Marat Radchenko
662a624819 Implement retry logic to fix LFS storage race conditions on Windows
Testing showed that while the race condition analysis in #3880 was correct, the fix it attempts
does not work for the *first* git-lfs process, the one that actually performs the file move.

Instead, this commit performs multiple attempts when working with files in LFS storage.

Similar logic is already implemented in "cmd/go/internal/robustio" and "cmd/go/internal/renameio" packages.
However, they are not public, so we cannot use them.
2019-11-05 17:30:13 +03:00
Marat Radchenko
285eebdddf Revert "Stop replacing files in LFS storage when downloading them concurrently on Windows"
This reverts commit 0c8edfc097a8364819649c9ba4df2221082a8cf8.
2019-11-05 17:30:13 +03:00
Marat Radchenko
0c8edfc097 Stop replacing files in LFS storage when downloading them concurrently on Windows
On Windows, there is no way to replace a file atomically. Instead, MoveFileExA:

1. Calls CreateFileA(access=Delete, shareMode=Delete)
2. Calls SetRenameInformationFile to rename file
3. Calls CloseFile

The problem is that if a parallel process attempts to open the destination file for reading between
steps 2 and 3, it will try to do so without shareMode=Delete and will hit a SHARING_VIOLATION error.

In practice, this race condition results in:

  Smudge error: Error opening media file.: open .git\lfs\objects\<sha>: The process cannot access the file because it is being used by another process.

There are two solutions here:

 1. Do not overwrite file if it already exists
 2. Retry reading from file if got a SHARING_VIOLATION

This commit implements option 1.

Fixes #2825 (for Windows, other OSes are race-free already). Also see #3813 and #3826.
2019-10-25 10:54:25 +03:00
Marat Radchenko
482260c7e3 Fix error strings to follow Go guidelines
Error strings should not be capitalized (unless beginning with proper nouns or acronyms) or end with punctuation:
https://github.com/golang/go/wiki/CodeReviewComments#error-strings
2019-10-22 17:33:49 +03:00
Marat Radchenko
1edb976a92 More robust handling of parallel attempts to download the same file
1. git-lfs now only writes to unique temp files created with `ioutil.TempFile`
   that are open with `O_CREATE|O_EXCL`
2. Partially-downloaded file is now atomically borrowed and returned back via `os.Rename`
3. `.part <-> .tmp` and `.tmp -> final` renames are allowed to fail and are handled appropriately

This is a continuation of #3813
Fixes #2825

There are several error codepaths where we borrow the .part file but remove it instead of returning it.
I believe that this is OK, and that in those erroneous cases it is better to restart the download from scratch
instead of attempting to use a possibly-corrupt .part file.
2019-09-26 22:21:11 +03:00
Marat Radchenko
01bc6a687e Do not fail when multiple processes download the same lfs file
When several git-lfs processes use the same LFS storage simultaneously,
there's a race condition between checking whether LFS file exists and renaming temporary
file to final name in LFS storage. When rename fails, check that target file exists.
If it does, pretend that download has finished successfully.

Fixes #2825
2019-09-10 13:05:28 +03:00
brian m. carlson
eb83fcda24
Avoid deadlock when transfer queue fails
In 1412d6e4 ("Don't fail if we lack objects the server has",
2019-04-30), we changed the code to abort later if a missing object
occurs.  In doing so, we had to consider the case where the transfer
queue aborts early for some reason and ensure that the sync.WaitGroup
does not unnecessarily block due to outstanding objects never getting
processed.

However, the approach we used, which was to explicitly add the number of
items we skipped processing, was error prone and didn't cover all cases.
Notably, a DNS failure could randomly cause a hang during a push.  Solve
this by creating a class for a wait group which is abortable and simply
abort it if we encounter an error, preventing any deadlocks caused by
miscounting the number of items.
2019-09-09 17:06:59 +00:00
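One way to build the abortable wait group that eb83fcda24 describes is sketched below. The type and method names are illustrative assumptions, and the real git-lfs type may be structured differently; the point is only that `Wait` returns either when the counter reaches zero or when `Abort` is called, so miscounted items cannot cause a hang.

```go
package main

import (
	"fmt"
	"sync"
)

// abortableWaitGroup is an illustrative type, not the git-lfs one.
type abortableWaitGroup struct {
	mu      sync.Mutex
	cond    *sync.Cond
	counter int
	aborted bool
}

func newAbortableWaitGroup() *abortableWaitGroup {
	wg := &abortableWaitGroup{}
	wg.cond = sync.NewCond(&wg.mu)
	return wg
}

// Add adjusts the number of outstanding items and wakes waiters at zero.
func (wg *abortableWaitGroup) Add(delta int) {
	wg.mu.Lock()
	defer wg.mu.Unlock()
	wg.counter += delta
	if wg.counter <= 0 {
		wg.cond.Broadcast()
	}
}

// Done marks one item as processed.
func (wg *abortableWaitGroup) Done() { wg.Add(-1) }

// Abort releases all current and future waiters regardless of the counter.
func (wg *abortableWaitGroup) Abort() {
	wg.mu.Lock()
	defer wg.mu.Unlock()
	wg.aborted = true
	wg.cond.Broadcast()
}

// Wait blocks until the counter reaches zero or Abort has been called.
func (wg *abortableWaitGroup) Wait() {
	wg.mu.Lock()
	defer wg.mu.Unlock()
	for wg.counter > 0 && !wg.aborted {
		wg.cond.Wait()
	}
}

func main() {
	wg := newAbortableWaitGroup()
	wg.Add(3)  // three items expected
	wg.Done()  // only one is ever processed
	wg.Abort() // an error occurred, so release the waiters
	wg.Wait()  // returns immediately instead of hanging
	fmt.Println("wait returned")
}
```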
brian m. carlson
28edeb7d43
tq: avoid a hang when Git is slow to provide us data
When we are cloning or checking out a repo, the filter-process smudge
filter sends the transfer queue data by calling the Add method.  This
method inserts items into the incoming channel, which is then processed
by the collectBatches function.  This function accepts items, downloads
a batch worth, and splits the remainder into the next and pending
queues, and then, if there are any items in either of these queues, goes
around again.  If there are no items, it exits the download queue.

Unfortunately, this algorithm has a problem.  If the filter-process
portion is slow and does not provide us enough data, we can end up with
both the next and pending queues empty, but the incoming channel still
open, waiting for the filter-process end to send more data.  In such a
case, we stop downloading early and the process deadlocks since there's
nobody to read from the channel.

The solution, however, is simple: since we set the closing flag whenever
the incoming channel is closed, we can check if that flag is set and
only quit if it is.  We know that if it is not, there's more data, and
the other end of the channel is just being slow (perhaps because Git is
processing other filters).  Do this to avoid the deadlock and ensure
that we terminate properly in all cases.
2019-09-09 16:13:18 +00:00
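The exit condition change in 28edeb7d43 can be reduced to a pair of predicates. These functions are a simplification written for illustration; the names and the integer queue lengths are assumptions, and the real `collectBatches` loop carries more state.

```go
package main

import "fmt"

// shouldExitOld mirrors the behaviour described as buggy: stop as soon as
// both the next and pending queues are empty.
func shouldExitOld(next, pending int) bool {
	return next == 0 && pending == 0
}

// shouldExit mirrors the described fix: empty queues only mean "done" once
// the incoming channel has been closed (the closing flag is set).
func shouldExit(next, pending int, closing bool) bool {
	return closing && next == 0 && pending == 0
}

func main() {
	// Git is slow: queues are empty but the incoming channel is still open.
	fmt.Println(shouldExitOld(0, 0))     // exits early
	fmt.Println(shouldExit(0, 0, false)) // keeps waiting for more data
	fmt.Println(shouldExit(0, 0, true))  // exits once the channel is closed
}
```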
brian m. carlson
bb05cf5053
Provide support for file URLs via a transfer agent
One commonly requested feature for Git LFS is support for local files.
Currently, we tell users that they must use a standalone transfer
agent, which is true, but nobody has provided one yet. Since writing a
simple transfer agent is not very difficult, let's provide one
ourselves.

Introduce a basic standalone transfer agent, git lfs standalone-file,
that handles uploads and downloads. Add a default configuration required
for it to work, while still allowing users to override this
configuration if they have a preferred implementation that is more
featureful. We provide this as a transfer agent instead of built-in
because it avoids the complexity of adding a different code path to the
main codebase, but also serves as a demonstration of how to write a
standalone transfer agent for others who might want to do so, much
like Git demonstrates remote helpers using its HTTP helper.
2019-08-02 17:23:47 +00:00
Kazuki MATSUDA / 松田一樹
4bc322352f
NON-ISSUE Update deprecated SEEK_SET, SEEK_CUR usage.
os.SEEK_XXX is now deprecated and we can use io.SeekXxx instead.

https://golang.org/pkg/os/#pkg-constants

```go
// Seek whence values.
//
// Deprecated: Use io.SeekStart, io.SeekCurrent, and io.SeekEnd.
const (
	SEEK_SET int = 0 // seek relative to the origin of the file
	SEEK_CUR int = 1 // seek relative to the current offset
	SEEK_END int = 2 // seek relative to the end
)
```

https://golang.org/pkg/io/#pkg-constants

```go
// Seek whence values.
const (
	SeekStart   = 0 // seek relative to the origin of the file
	SeekCurrent = 1 // seek relative to the current offset
	SeekEnd     = 2 // seek relative to the end
)
```
2019-07-27 12:48:42 +09:00
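For reference, a short sketch of the `io.Seek*` constants in use, here with a `strings.Reader` rather than a file. The function name `seekDemo` is made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// seekDemo returns the offsets reported by three successive seeks on a
// ten-byte reader, one for each whence constant.
func seekDemo() []int64 {
	r := strings.NewReader("0123456789")
	fromStart, _ := r.Seek(4, io.SeekStart)     // absolute offset 4
	fromCurrent, _ := r.Seek(2, io.SeekCurrent) // 4 + 2
	fromEnd, _ := r.Seek(-1, io.SeekEnd)        // 10 - 1
	return []int64{fromStart, fromCurrent, fromEnd}
}

func main() {
	fmt.Println(seekDemo()) // prints [4 6 9]
}
```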
Kitten King
f5f712b4de
Fix Typos 2019-07-24 07:17:40 +00:00
brian m. carlson
1412d6e47a
Don't fail if we lack objects the server has
A Git LFS client may not have the entire history of the objects for the
repository. However, in some situations, we traverse the entire history
of a branch when pushing it, meaning that we need to process every
LFS object in the history of that branch. If the objects for the entire
history are not present, we currently fail to push.

Instead, let's mark objects we don't have on disk as missing and only
fail when we would need to upload those objects. We'll know the server
has the objects if the batch response provides no actions to take for
them when we request an upload. Pass the missing flag down through the
code, and always set it to false for non-uploads.

If for some reason we fail to properly flag a missing object, we will
still fail later on when we cannot open the file, just in a messier and
more poorly controlled way. The technique used here will attempt to
abort the batch as soon as we notice a problem, which means that in the
common case (less than 100 objects) we won't have transferred any
objects, so the user can notice the failure as soon as possible.

Update the tests to look for a string which will occur in the error
message, since we no longer produce the system error message for ENOENT.
2019-07-15 20:47:04 +00:00
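The rule from 1412d6e47a, fail only when a missing object would actually need uploading, can be sketched like this. The `transfer` type, the `needsUpload` map standing in for the batch response, and the error text are all assumptions for illustration.

```go
package main

import "fmt"

// transfer is a hypothetical stand-in for an object queued for push.
type transfer struct {
	Oid     string
	Missing bool // true when the object is not on local disk
}

// checkMissing returns an error for the first object that is missing locally
// and for which the (assumed) batch response requested an upload. Objects the
// server already has carry no upload action, so being missing is harmless.
func checkMissing(objs []transfer, needsUpload map[string]bool) error {
	for _, o := range objs {
		if o.Missing && needsUpload[o.Oid] {
			return fmt.Errorf("missing local copy of object %s", o.Oid)
		}
	}
	return nil
}

func main() {
	objs := []transfer{{Oid: "aaa", Missing: true}, {Oid: "bbb"}}
	fmt.Println(checkMissing(objs, map[string]bool{"bbb": true}))
	fmt.Println(checkMissing(objs, map[string]bool{"aaa": true}))
}
```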
brian m. carlson
61bc880d9e
tq: ensure we pass the correct Accept header in verify requests
When we perform a verify request, we need to specify a proper Accept
header along with the Content-Type header. Do so.

Hoist the setting of the headers before the code which copies headers
from the remote side, so that the remote side can override it with a
suitable header if necessary. This is required for compatibility with
some existing servers.
2019-05-22 21:01:54 +00:00
Hidetoshi Hirokawa
0f1e9b7523 tq/adapterbase: fix typo enableHrefRerite to enableHrefRewrite 2019-04-04 11:04:00 +09:00
Hidetoshi Hirokawa
a935aec8ab tq/adapterbase: add lfs.transfer.enablehrefrewrite config
For backward compatibility and some incompatible situations such as
using SSH protocol, href-rewriting is only enabled if
lfs.transfer.enablehrefrewrite is set to true explicitly.
2019-04-03 13:01:37 +09:00
Hidetoshi Hirokawa
7da4901957 tq/adapterbase: support rewriting href
Use insteadOf/pushInsteadOf aliases to rewrite href to upload/download
LFS objects. This is useful in situations such as when you need to access
the LFS server via a reverse proxy.
2019-04-02 17:48:21 +09:00
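A rough sketch of the kind of prefix rewriting that 7da4901957 describes follows. It assumes Git's usual `insteadOf` semantics (the longest matching prefix wins); the function name `rewriteHref`, the map layout, and the example URLs are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteHref is a hypothetical helper. aliases maps an insteadOf prefix to
// the base URL that should replace it. The longest matching prefix is used,
// and an href with no matching prefix is returned unchanged.
func rewriteHref(href string, aliases map[string]string) string {
	best := ""
	for prefix := range aliases {
		if strings.HasPrefix(href, prefix) && len(prefix) > len(best) {
			best = prefix
		}
	}
	if best == "" {
		return href
	}
	return aliases[best] + href[len(best):]
}

func main() {
	aliases := map[string]string{
		"https://lfs.example.com/": "https://proxy.example.com/lfs/",
	}
	fmt.Println(rewriteHref("https://lfs.example.com/objects/abc", aliases))
}
```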
brian m. carlson
73e8a713aa
tq: retry on oversize file
If we're attempting to resume a download but the file we have on disk is
the expected size or too large, we'll send an invalid Range header where
the lower bound is greater than the upper bound. If this situation
occurs, simply retry the download from the beginning, since we clearly
don't have the expected data (or we wouldn't be resuming) and it isn't
clear how else to recover.
2019-02-22 14:46:10 +00:00
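The condition in 73e8a713aa can be sketched as a guard around building the Range header. The helper name `resumeRange` and the exact header layout (`bytes=first-last`) are assumptions here; the commit message only states that the lower bound ends up greater than the upper bound.

```go
package main

import "fmt"

// resumeRange is a hypothetical helper. Given the number of bytes already on
// disk and the expected object size, it returns a Range header value for
// resuming the download. ok is false when there is nothing valid to resume
// (no partial data, or a file at or beyond the expected size), in which case
// the caller should restart the download from the beginning.
func resumeRange(onDisk, size int64) (header string, ok bool) {
	if onDisk <= 0 || onDisk >= size {
		return "", false
	}
	return fmt.Sprintf("bytes=%d-%d", onDisk, size-1), true
}

func main() {
	fmt.Println(resumeRange(100, 1000))  // valid resume
	fmt.Println(resumeRange(1000, 1000)) // already the expected size
	fmt.Println(resumeRange(1500, 1000)) // oversize file
}
```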
brian m. carlson
3a278dbc49
tq: avoid nil pointer dereference on download failure
In 4775f9c1 ("tq: avoid nil pointer dereference on unexpected failure",
2019-02-19), we fixed an issue on uploads where we could dereference a
nil http.Response pointer when uploading. We should also check for this
when downloading. Since the check a few lines below is now redundant,
remove it.
2019-02-20 20:42:14 +00:00
brian m. carlson
4775f9c19c
tq: avoid nil pointer dereference on unexpected failure
When we get an error making an API request, we check to see if the
response is a 429, and if so, we attempt to back off and retry later.
However, we failed to check if we actually had a response; if the
response was missing (say, because we had a network issue), then we
would dereference a nil pointer.

Instead, let's explicitly check for a non-nil response before querying
it for a status code.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
2019-02-20 14:47:35 +00:00
Zac Romero
4fe0a8db4f Add integration tests; check other places where 429 could occur 2019-01-07 12:27:02 +03:00
Zac Romero
2197f76dde Add support for retries with delays (ex rate limiting) 2018-12-26 16:30:53 +01:00