Several panic() calls in the basic and tus.io transfer
queue adapters have very similar messages, which we can
make more consistent with each other; this will ease
translation support when we make these messages translatable
in a subsequent commit.
We rephrase some error message strings which are used when
wrapping other errors so they are clearer and more consistent
with other messages. Note that these strings will be
prepended to the wrapper errors' messages.
In the lfs/diff_index_scanner.go file in particular we
rephrase the additional message to include a full Git
command ("git diff-index"), which is similar to how errors
are reported in the git package.
One error message output by the transfer queue currently
begins with a "tq:" prefix, which is unlike any other error
messages from that package, or from anywhere else, for that
matter.
We can therefore simplify our message string slightly by
simply removing this prefix from it.
Note that this message is not yet passed as a translation
string, but we will address that issue in a subsequent commit.
Each of the upload transfer queue adapters reports an error
with a similar message in the case of a 403 response, and all
of these messages begin with an "http:" prefix (even the
one in the SSH adapter). However, other status codes are not
reported the same way.
This prefix should either be moved out of the translatable
message text or removed entirely; we choose the latter option
here as the simplest. We also now interpolate the status code
rather than make it a fixed part of the translatable string,
and start the messages with a capital letter.
Several error messages output by the "git lfs track" command and
the transfer queue currently begin with a "Git LFS:" prefix, which
is unlike any other error messages from either that command or
that package, or anywhere else, for that matter.
We can therefore simplify our translation message strings slightly
by simply removing this prefix from them.
Remove the Verb method on the Direction object in favor of a method
formatting the operation in progress. This is easier to translate and
will prevent sentence fragments from appearing in strings.
Additionally, move a variable down into a function so that we can
translate it, since strings at the top level cannot be translated due to
the locale object not being initialized yet.
During a batch request, pass the hash algorithm we're using to the
remote server, and read the value, if any, that we get back. If it is
neither absent nor the string "sha256", fail, since that means that
the client and server don't agree on the proper hash algorithm.
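A minimal sketch of that check, using a hypothetical validateHashAlgo helper (the real logic lives in git-lfs's batch response handling):

```go
package main

import "fmt"

// validateHashAlgo checks the hash algorithm reported back by the server
// in a batch response. An absent value (empty string) is treated as the
// default, "sha256"; anything else means the client and server disagree,
// so we must fail. The helper name is illustrative, not git-lfs's API.
func validateHashAlgo(algo string) error {
	if algo == "" || algo == "sha256" {
		return nil
	}
	return fmt.Errorf("unsupported hash algorithm %q; expected sha256", algo)
}

func main() {
	fmt.Println(validateHashAlgo("") == nil)       // true
	fmt.Println(validateHashAlgo("sha256") == nil) // true
	fmt.Println(validateHashAlgo("sha512") == nil) // false
}
```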
Since we're about to do a v3.0.0 release, let's bump the version to v3.
Make this change automatically with the following command to avoid any
missed items:
```
git grep -l github.com/git-lfs/git-lfs/v2 | \
  xargs sed -i -e 's!github.com/git-lfs/git-lfs/v2!github.com/git-lfs/git-lfs/v3!g'
```
While the 429 retry-after HTTP error handling for the storage endpoint works
just as expected, the same was not true for the batch API endpoint. In case of a
429 error, the call to the batch API endpoint would be retried and the
`retry-after` HTTP header honoured correctly, but the LFS command would still
exit with a generic `LFS: Error` message. This was caused by the fact that
while the retry-able error from the `Batch` method was correctly overwritten by
`nil` in case of a retry, it would finally be converted into a retry-able error
again and hence no longer be `nil`. This would lead to a bubble-up of
the original 429 retry-able HTTP error to the final error check, although the
exact operation was successfully retried. Furthermore, the overwrite with
`nil` during correct handling of the first object caused all subsequent objects
to fail and hence never be enqueued for retry. This was solved by
removing the immediate overwrite with `nil` in case of a retry-able-later
error and doing the final error conversion based on the whole batch
result instead of a single object result.
When our go.mod file was introduced in commit
114e85c2002091eb415040923d872f8e4a4bc636 in PR #3208, the module
path chosen did not include a trailing /v2 component. However,
the Go modules specification now advises that module paths must
have a "major version suffix" which matches the release version.
We therefore add a /v2 suffix to our module path and all its
instances in import paths.
See also https://golang.org/ref/mod#major-version-suffixes for
details regarding the Go module system's major version suffix rule.
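For illustration, the change amounts to adding the suffix to the module
directive and to every import path, e.g. (the tq package shown here is one
example among many):

```
# go.mod, before and after
module github.com/git-lfs/git-lfs
module github.com/git-lfs/git-lfs/v2

# import paths, before and after
import "github.com/git-lfs/git-lfs/tq"
import "github.com/git-lfs/git-lfs/v2/tq"
```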
The ClearTempStorage() method of the transfer Adapter interface,
while implemented for most adapters, is never actually called,
and moreover is dangerous in some of its current implementations
if it were ever to be called.
Specifically in the basic upload and download adapters and the
SSH adapter, the ClearTempStorage() method relies on an internal
tempDir() method to return the path to the temporary directory
to be cleared, which ClearTempStorage() then removes entirely.
The tempDir() methods, however, can return os.TempDir() if they
are unable to create a local temporary directory within the
Git LFS storage directory, and also do not cache the path to
the directory they first returned. So were ClearTempStorage()
ever to be called in one of these adapters, its invocation of
tempDir() might just return the system temporary directory
(e.g., /tmp on Unix), which it would then attempt to remove.
The ClearTempStorage() method was introduced in commits
f124f0585f73279195cb20ecb4aa9e149512b48a and
940a91a8df18eaf16e9a0162999f26d59fa9f5af of PR #1265, but
has never been called by any user of the transfer
adapter interface.
We therefore just remove this method and its implementations
from all the current transfer adapters.
We remove one extra copy of a comment added in commit
9c46a38281283d919e841cf3dcae35b4f686e0e1 of PR #4446,
where the first copy was already in place from commit
594f8e386cce3441e06c9094ab5e251f0e07ca1f in the same PR.
We update the comment describing the Begin method of the
Adapter interface to match that for the corresponding method
of the SSH transfer adapter, which was added in commit
594f8e386cce3441e06c9094ab5e251f0e07ca1f of PR #4446.
The older comment for the interface description has been
out of sync with the actual method signature since at least
commit 303156e0e22ef7cdd863605fbb501174dc25e08a of PR #1774,
when the maxConcurrency integer argument was replaced with
the current "cfg" AdapterConfig one.
The lfsapi client is used to perform operations for SSH authentication
already, so we know we'll have one wherever SSH operations might be
done. Let's move the instantiation to the client so that we can reuse
it both in the transfer queue code and in the locking code as well.
When using the pure SSH-based protocol, we can get much higher speeds by
multiplexing multiple connections on the same SSH connection. If we're
using OpenSSH, let's enable the ControlMaster option unless
lfs.ssh.automultiplex is set to false, and multiplex these shell
operations over one connection.
We prefer XDG_RUNTIME_DIR because it's guaranteed to be private and we
can share many connections over one socket, but if that's not set, let's
default to creating a new temporary directory for the socket. On
Windows, where the native SSH client doesn't support ControlMaster,
we fall back automatically to using multiple connections, since
we specify ControlMaster=auto.
Note that the option exists because users may already be using SSH
multiplexing and we would want to provide a way for them to disable
this, in addition to the case where users have an old or broken OpenSSH
which cannot support this option.
We pass the connection object into each worker and adjust our transfer
code to pass it into each function we invoke. We also make sure to
properly terminate each connection at the end by reducing our connection
count to 0, which closes the extra (i.e., all) connections.
Co-authored-by: Chris Darroch <chrisd8088@github.com>
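A rough sketch of the OpenSSH options involved, using an illustrative helper (git-lfs's actual argument construction differs):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// controlMasterArgs returns extra OpenSSH arguments that enable
// connection multiplexing. ControlMaster=auto makes ssh reuse an
// existing master connection, or fall back to a plain connection when
// multiplexing is unsupported (as on Windows). "%C" is OpenSSH's
// per-destination hash token for the socket name. The function name
// and socket layout here are illustrative assumptions.
func controlMasterArgs(multiplex bool, socketDir string) []string {
	if !multiplex {
		return nil
	}
	return []string{
		"-oControlMaster=auto",
		"-oControlPath=" + filepath.Join(socketDir, "lfs-%C.sock"),
	}
}

func main() {
	fmt.Println(controlMasterArgs(true, "/run/user/1000"))
	fmt.Println(controlMasterArgs(false, ""))
}
```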
Since we currently support multiple connections for HTTPS, let's also
add multiple connections for pure SSH connections. For now, we spawn a
whole new connection, but in the future we'll support using OpenSSH's
ControlMaster flag.
Introduce several new functions to create SSH connections and adjust the
number of connections being used.
A pure SSH-based protocol has been a long time request from many users,
so let's implement one. Implement basic upload and download support,
plus batch requests, and support pkt-line tracing for text packets to
make debugging easier.
Note that locking is not yet supported; that will come in a future
patch.
We prefer the Endpoint function to the RemoteEndpoint function because
the former handles lfs.url and the latter does not.
Update a comment about the shared temporary directory; it is no longer
cleared automatically by the adapter, and instead the cleanup happens
later in other code. Therefore, it is safe to share the directory among
the transport adapters.
Co-authored-by: Chris Darroch <chrisd8088@github.com>
Right now, we always perform a batch operation using an instance of the
transfer queue client, which is HTTP based. However, in the future,
we'll want to add an SSH-based option, so let's turn the client into a
simple interface we can use to abstract this away.
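A simplified sketch of that abstraction (the names are illustrative, not the exact git-lfs types):

```go
package main

import "fmt"

// BatchClient abstracts the batch operation so that an SSH-based
// implementation can later sit alongside the HTTP-based one.
type BatchClient interface {
	Batch(remote string, oids []string) ([]string, error)
}

// httpBatchClient stands in for the existing HTTP implementation; a
// real client would POST to the LFS batch endpoint here.
type httpBatchClient struct{}

func (c *httpBatchClient) Batch(remote string, oids []string) ([]string, error) {
	return oids, nil
}

func main() {
	var client BatchClient = &httpBatchClient{}
	oids, err := client.Batch("origin", []string{"abc123"})
	fmt.Println(oids, err)
}
```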
Our NTLM support has been known to be broken in various situations for a
while, specifically on Windows. The core team is unable to troubleshoot
these problems, and nobody has stepped up to maintain the NTLM support.
In addition, NTLM uses cryptography and security techniques that are
known to be insecure, such as the algorithms DES, MD4, and MD5, as well
as simple, unsalted hashes of passwords.
Since we now support Kerberos, most users should be able to replace
their use of NTLM with Kerberos instead. Users have reported this
working on Windows, and it is known to work well on at least Debian.
Drop support for NTLM and remove it from the codebase.
Previously, retries happened as fast as possible unless the server
provided the Retry-After header. This is effective for certain types of
errors, but not when the LFS server is experiencing a temporary but not
instantaneous failure. Delaying between retries lets the server recover
and the LFS operation complete.
Delays start at a fixed 250ms for the first retry and double with each
successive retry up to a configurable maximum delay, 10s by default. The
maximum delay is configurable using lfs.transfer.maxretrydelay. Delays
can be disabled by setting the maximum delay to 0.
When we have an error creating the local storage directory, such as if
the permissions are wrong, we fail to produce a helpful error message,
since we fail to check the error we produce.
Let's take the error we get in such a case and pass it through to the
transfer queue, where we can handle it as any other error. This means
that if only one directory has a problem, we can transfer the rest of
the objects and fail for only the one problematic object.
In the future, we're going to need to access the access-related types
in the lfshttp package. To avoid an import loop, move Access and
AccessMode into the creds package. Add constructors and accessors since
the members are private.
If the batch operation fails due to an error instead of a bad HTTP
status code, we'll abort the batch operation and retry. This appears to
be a regression from 1412d6e4 ("Don't fail if we lack objects the server
has", 2019-04-30), which caused us to handle errors differently.
Since there are two error returns from enqueueAndCollectRetriesFor,
let's wrap the batch error case as a retriable error and not abort if we
find a retriable error later on. This lets us continue to abort if we
get a missing object, which should be fatal, but retry in the more
common network failure case.
Currently, if you have a remote with a file URL but have lfs.url set to
point to a non-file URL, your transfers will fail. This occurs because
we look up whether to default to the built-in transfer adapter based
on the remote URL, not the endpoint URL. We then attempt to pass
the HTTP URL to the standalone transfer adapter, which doesn't work.
Look up the endpoint URL as well and pass that to the code which
determines whether we should use the builtin transfer adapter. If the
actual endpoint is not a file URL, then we won't suggest the builtin
transfer adapter.
Testing showed that while the race condition analysis in #3880 was correct, the way it tries
to fix that does not work for the *first* git-lfs process that actually performs the file move.
Instead, this commit performs multiple attempts when working with files in LFS storage.
Similar logic is already implemented in "cmd/go/internal/robustio" and "cmd/go/internal/renameio" packages.
However, they are not public, so we cannot use them.
On Windows, there is no way to replace a file atomically. Instead, MoveFileExA:
1. Calls CreateFileA(access=Delete, shareMode=Delete)
2. Calls SetRenameInformationFile to rename file
3. Calls CloseFile
The problem is that if a parallel process attempts to open the destination file for reading between
steps 2 and 3, it will try to do so without shareMode=Delete and will hit a SHARING_VIOLATION error.
In practice, this race condition results in:
Smudge error: Error opening media file.: open .git\lfs\objects\<sha>: The process cannot access the file because it is being used by another process.
There are two solutions here:
1. Do not overwrite the file if it already exists
2. Retry reading from the file if we get a SHARING_VIOLATION
This commit implements option 1.
Fixes #2825 (for Windows; other OSes are race-free already). Also see #3813 and #3826.
1. git-lfs now only writes to unique temp files created with `ioutil.TempFile`,
which are opened with `O_CREATE|O_EXCL`
2. The partially-downloaded file is now atomically borrowed and returned via `os.Rename`
3. `.part <-> .tmp` and `.tmp -> final` renames are allowed to fail and are handled appropriately
This is a continuation of #3813. Fixes #2825.
There are several error codepaths where we borrow the .part file but remove it instead of
returning it. I believe that this is OK; in those erroneous cases it is better to restart
the download from scratch instead of attempting to use a possibly-corrupt .part file.
When several git-lfs processes use the same LFS storage simultaneously,
there's a race condition between checking whether an LFS file exists and renaming
a temporary file to its final name in LFS storage. When the rename fails, check
whether the target file exists. If it does, pretend that the download finished
successfully.
Fixes #2825
In 1412d6e4 ("Don't fail if we lack objects the server has",
2019-04-30), we changed the code to abort later if a missing object
occurs. In doing so, we had to consider the case where the transfer
queue aborts early for some reason and ensure that the sync.WaitGroup
does not unnecessarily block due to outstanding objects never getting
processed.
However, the approach we used, which was to explicitly add the number of
items we skipped processing, was error prone and didn't cover all cases.
Notably, a DNS failure could randomly cause a hang during a push. Solve
this by creating a class for a wait group which is abortable and simply
abort it if we encounter an error, preventing any deadlocks caused by
miscounting the number of items.
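The idea can be sketched as a small wrapper around sync.WaitGroup (a simplified stand-in, not git-lfs's actual type):

```go
package main

import (
	"fmt"
	"sync"
)

// abortableWaitGroup tracks how many items are outstanding so that an
// error path can release all waiters at once, instead of trying to
// count down the exact number of skipped items.
type abortableWaitGroup struct {
	mu    sync.Mutex
	count int
	wg    sync.WaitGroup
}

func (a *abortableWaitGroup) Add(n int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.count += n
	a.wg.Add(n)
}

func (a *abortableWaitGroup) Done() {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.count > 0 {
		a.count--
		a.wg.Done()
	}
}

// Abort releases all outstanding items, e.g. after a fatal error such
// as a DNS failure, so Wait can never deadlock on miscounted items.
func (a *abortableWaitGroup) Abort() {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.wg.Add(-a.count)
	a.count = 0
}

func (a *abortableWaitGroup) Wait() {
	a.wg.Wait()
}

func main() {
	var awg abortableWaitGroup
	awg.Add(3)
	awg.Done()
	awg.Abort() // releases the two remaining items
	awg.Wait()  // returns immediately; no deadlock
	fmt.Println("no deadlock")
}
```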
When we are cloning or checking out a repo, the filter-process smudge
filter sends the transfer queue data by calling the Add method. This
method inserts items into the incoming channel, which is then processed
by the collectBatches function. This function accepts items, downloads
a batch worth, and splits the remainder into the next and pending
queues, and then, if there are any items in either of these queues, goes
around again. If there are no items, it exits the download queue.
Unfortunately, this algorithm has a problem. If the filter-process
portion is slow and does not provide us enough data, we can end up with
both the next and pending queues empty, but the incoming channel still
open, waiting for the filter-process end to send more data. In such a
case, we stop downloading early and the process deadlocks since there's
nobody to read from the channel.
The solution, however, is simple: since we set the closing flag whenever
the incoming channel is closed, we can check if that flag is set and
only quit if it is. We know that if it is not, there's more data, and
the other end of the channel is just being slow (perhaps because Git is
processing other filters). Do this to avoid the deadlock and ensure
that we terminate properly in all cases.
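The fixed exit condition can be sketched like this (a simplified stand-in for collectBatches, not the actual code):

```go
package main

import "fmt"

// collectAll only terminates once the incoming channel is closed (the
// "closing" flag) and all queued items have been processed. With a slow
// producer, checking queue emptiness alone could cause an early exit;
// checking the flag avoids that. Names here are illustrative.
func collectAll(incoming <-chan int, batchSize int) int {
	var next []int
	processed := 0
	closing := false
	for {
		// Fill up to one batch, blocking until data arrives or the
		// channel is closed; a slow producer just makes us wait here.
		for !closing && len(next) < batchSize {
			item, ok := <-incoming
			if !ok {
				closing = true
				break
			}
			next = append(next, item)
		}
		processed += len(next) // stand-in for downloading the batch
		next = next[:0]
		// Only quit once the channel is closed; empty queues alone are
		// not a reason to stop.
		if closing {
			return processed
		}
	}
}

func main() {
	ch := make(chan int)
	go func() {
		for i := 0; i < 10; i++ {
			ch <- i
		}
		close(ch)
	}()
	fmt.Println(collectAll(ch, 3)) // 10
}
```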
One commonly requested feature for Git LFS is support for local files.
Currently, we tell users that they must use a standalone transfer
agent, which is true, but nobody has provided one yet. Since writing a
simple transfer agent is not very difficult, let's provide one
ourselves.
Introduce a basic standalone transfer agent, git lfs standalone-file,
that handles uploads and downloads. Add a default configuration required
for it to work, while still allowing users to override this
configuration if they have a preferred implementation that is more
featureful. We provide this as a transfer agent instead of built-in
because it avoids the complexity of adding a different code path to the
main codebase, but also serves as a demonstration of how to write a
standalone transfer agent for others who might want to do so, much
like Git demonstrates remote helpers using its HTTP helper.
os.SEEK_XXX is now deprecated, and we can use io.SeekXxx instead.
https://golang.org/pkg/os/#pkg-constants
```go
// Seek whence values.
//
// Deprecated: Use io.SeekStart, io.SeekCurrent, and io.SeekEnd.
const (
	SEEK_SET int = 0 // seek relative to the origin of the file
	SEEK_CUR int = 1 // seek relative to the current offset
	SEEK_END int = 2 // seek relative to the end
)
```
https://golang.org/pkg/io/#pkg-constants
```go
// Seek whence values.
const (
	SeekStart   = 0 // seek relative to the origin of the file
	SeekCurrent = 1 // seek relative to the current offset
	SeekEnd     = 2 // seek relative to the end
)
```
A Git LFS client may not have the entire history of the objects for the
repository. However, in some situations, we traverse the entire history
of a branch when pushing it, meaning that we need to process every
LFS object in the history of that branch. If the objects for the entire
history are not present, we currently fail to push.
Instead, let's mark objects we don't have on disk as missing and only
fail when we would need to upload those objects. We'll know the server
has the objects if the batch response provides no actions to take for
them when we request an upload. Pass the missing flag down through the
code, and always set it to false for non-uploads.
If for some reason we fail to properly flag a missing object, we will
still fail later on when we cannot open the file, just in a messier and
more poorly controlled way. The technique used here will attempt to
abort the batch as soon as we notice a problem, which means that in the
common case (less than 100 objects) we won't have transferred any
objects, so the user can notice the failure as soon as possible.
Update the tests to look for a string which will occur in the error
message, since we no longer produce the system error message for ENOENT.
When we perform a verify request, we need to specify a proper Accept
header along with the Content-Type header. Do so.
Hoist the setting of the headers before the code which copies headers
from the remote side, so that the remote side can override it with a
suitable header if necessary. This is required for compatibility with
some existing servers.
For backward compatibility, and for some incompatible situations such
as using the SSH protocol, href rewriting is only enabled if
lfs.transfer.enablehrefrewrite is set to true explicitly.
Use insteadOf/pushInsteadOf aliases to rewrite the href used to upload
or download LFS objects. This is useful in situations where you need
to access the LFS server via a reverse proxy.
If we're attempting to resume a download but the file we have on disk is
the expected size or too large, we'll send an invalid Range header where
the lower bound is greater than the upper bound. If this situation
occurs, simply retry the download from the beginning, since we clearly
don't have the expected data (or we wouldn't be resuming) and it isn't
clear how else to recover.
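The guard can be sketched as follows (the helper is an illustrative assumption; the real code builds its Range header inside the download adapter):

```go
package main

import "fmt"

// resumeRange returns the Range header for resuming a download, or an
// empty string when the bytes we already have are >= the expected size,
// in which case the caller should restart from scratch rather than send
// an invalid range whose lower bound exceeds its upper bound.
func resumeRange(haveBytes, expectedSize int64) string {
	if haveBytes <= 0 || haveBytes >= expectedSize {
		return ""
	}
	return fmt.Sprintf("bytes=%d-", haveBytes)
}

func main() {
	fmt.Println(resumeRange(50, 100))  // bytes=50-
	fmt.Println(resumeRange(100, 100)) // "" -> restart from the beginning
}
```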
In 4775f9c1 ("tq: avoid nil pointer dereference on unexpected failure",
2019-02-19), we fixed an issue on uploads where we could dereference a
nil http.Response pointer when uploading. We should also check for this
when downloading. Since the check a few lines below is now redundant,
remove it.
When we get an error making an API request, we check to see if the
response is a 429, and if so, we attempt to back off and retry later.
However, we failed to check if we actually had a response; if the
response was missing (say, because we had a network issue), then we
would dereference a nil pointer.
Instead, let's explicitly check for a non-nil response before querying
it for a status code.
Co-authored-by: Taylor Blau <me@ttaylorr.com>