There are two ways to specify an alternate in Git: via the repository
file or via the environment. gitobj recently learned how to accept the
contents of an environment variable and use it, so pass the environment
variable when creating a new object database.
When performing a fetch, we don't use the object scanner, but we do when
performing a push. To make sure that we're exercising the object scanner
and gitobj adequately, simulate a push using alternates as well.
Per our work in the previous commit(s), it is now suitable to call
formatRefName() again, thus avoiding our "hack" to prepend the remote
name to the reference before calling '(*git.Ref) Refspec()'.
As such, let's do that.
Co-authored-by: brian m. carlson <bk2204@github.com>
Since 'formatRefName()' no longer has multiple case statements, let's
simplify the body to use an `if` instead.
Co-authored-by: brian m. carlson <bk2204@github.com>
In 'formatRefName()', we special-case references belonging to a remote
in order to round-trip a reference name that 'git-rev-parse(1)' (et
al.) will understand.
This is done because the `*git.Ref` type does not retain information
about the remote to which it belongs, so we add it back here as a
shortcut to avoid making this change throughout the codebase.
Falling back to ref.Name, however, can introduce ambiguity when there is
a top-level reference of the same name. For example, calling
'formatRefName()' on a reference named 'refs/heads/foo' when there is a
reference named simply 'foo' will cause Git to complain.
Avoid this by calling `(*git.Ref) Refspec()`, which reintroduces the
prefix, thus eliminating any ambiguity.
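To make the idea concrete, here is a minimal sketch of how a
Refspec()-style method can reintroduce the prefix. The Ref and RefType
definitions below are simplified, hypothetical stand-ins and may differ
from the real types in the git package.

```go
package main

import "fmt"

// RefType and Ref are simplified, hypothetical stand-ins for the types in
// the git package; the real definitions may differ.
type RefType int

const (
	RefTypeLocalBranch RefType = iota
	RefTypeRemoteBranch
	RefTypeLocalTag
	RefTypeOther
)

type Ref struct {
	Name string
	Type RefType
}

// Refspec returns a fully-qualified name when the type has a known prefix,
// and falls back to the bare name otherwise.
func (r *Ref) Refspec() string {
	switch r.Type {
	case RefTypeLocalBranch:
		return "refs/heads/" + r.Name
	case RefTypeRemoteBranch:
		return "refs/remotes/" + r.Name
	case RefTypeLocalTag:
		return "refs/tags/" + r.Name
	}
	return r.Name
}

func main() {
	branch := &Ref{Name: "foo", Type: RefTypeLocalBranch}
	fmt.Println(branch.Refspec()) // prints "refs/heads/foo"
}
```

Passing 'refs/heads/foo' rather than 'foo' leaves Git no room to confuse
the branch with a top-level reference of the same short name.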
Co-authored-by: brian m. carlson <bk2204@github.com>
Since 5e0f81d2 (Adding utility methods to get recent refs, 2015-08-17),
we have treated references beginning with the prefix "refs/remotes/tags"
as being of type "remote tag".
This is not a namespace that upstream Git uses in practice to denote
remote tags. Instead, tags are automatically namespaced into our local
refs/tags, hence rendering this prefix (and its type) meaningless.
As such, let's remove it so that the 'git lfs migrate' (et al.)
commands do not get confused about references that do not exist.
Co-authored-by: brian m. carlson <bk2204@github.com>
We have used getRemoteRefs() since [1] to return a slice of all remote
references, without distinguishing which remote they came from.
This was OK to do, since we modified each reference's "name" (i.e., the
last component of the fully qualified reference name as read from Git
when splitting on the '/' character) to include the remote name. This
allowed us to call '(*git.Ref).Refspec()', which would theoretically
format a correctly printed reference name for all kinds of
references.
This is broken in practice, so to prepare for a future commit that will
fix it, let's return a map of remote name to all references present on
that remote, such that we can format each based on its remote and full
name individually.
[1]: ce89eb35 (commands/command_migrate: teach how to find all remote
refs, 2017-06-09)
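A minimal sketch of the map shape described above follows. It groups
fully-qualified reference names by remote rather than querying Git, the
helper name groupByRemote is hypothetical, and it assumes remote names
contain no '/' character.

```go
package main

import (
	"fmt"
	"strings"
)

// groupByRemote is a hypothetical helper. It groups fully-qualified remote
// references ("refs/remotes/<remote>/<name>") by remote name, assuming that
// remote names contain no '/' character. Other references are skipped.
func groupByRemote(refs []string) map[string][]string {
	const prefix = "refs/remotes/"

	byRemote := make(map[string][]string)
	for _, ref := range refs {
		if !strings.HasPrefix(ref, prefix) {
			continue
		}
		parts := strings.SplitN(strings.TrimPrefix(ref, prefix), "/", 2)
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			continue
		}
		byRemote[parts[0]] = append(byRemote[parts[0]], ref)
	}
	return byRemote
}

func main() {
	grouped := groupByRemote([]string{
		"refs/remotes/origin/master",
		"refs/remotes/origin/feature/x",
		"refs/remotes/upstream/master",
		"refs/heads/master",
	})
	fmt.Println("origin:", grouped["origin"])
	fmt.Println("upstream:", grouped["upstream"])
}
```

Keeping the full name alongside the remote lets each reference be
formatted individually later on.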
The .git directory within a worktree is necessarily incomplete.
Typically, it will lack an objects directory, among other things, since
the object storage is shared among all worktrees. The .git directory
belonging to the main repo contains all these missing items and is
referred to as the common directory.
Add a function to look up the common directory so that we can determine
the storage location for objects. Explicitly check for the version and
fall back to looking up the plain Git directory if worktree support
isn't available. This is required since git rev-parse passes through
options it doesn't support, so not checking for the version would mean
that the common directory would always be "--git-common-dir" on older
Git versions.
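The version check can be sketched roughly as follows. The function names
are hypothetical, and the cutoff of Git 2.5 for '--git-common-dir' is an
assumption based on when worktree support arrived. The caller is expected
to supply the already-parsed Git version.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// commonDirArgs picks the rev-parse option to use. Assumption: worktree
// support, and with it --git-common-dir, is available from Git 2.5 onwards.
func commonDirArgs(major, minor int) []string {
	if major > 2 || (major == 2 && minor >= 5) {
		return []string{"rev-parse", "--git-common-dir"}
	}
	// Older versions would echo the unknown option back, so ask for the
	// plain Git directory instead.
	return []string{"rev-parse", "--git-dir"}
}

// gitCommonDir is a hypothetical wrapper that runs Git with the chosen
// option and trims the trailing newline from its output.
func gitCommonDir(major, minor int) (string, error) {
	out, err := exec.Command("git", commonDirArgs(major, minor)...).Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	fmt.Println(commonDirArgs(2, 4))
	fmt.Println(commonDirArgs(2, 20))
}
```

The objects directory is then the "objects" entry beneath whichever
directory is returned.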
Update the one existing caller that uses gitobj to use this function for
finding the objects directory correctly.
Sometimes when invoking 'git lfs migrate import', mysterious output like
the following can be seen:
$ git lfs migrate import
migrate: override changes in your working copy? [Y/n] y

migrate: changes in your working copy will be overridden ...
migrate: Fetching remote refs: ..., done
Where an extra newline is printed between the answer, 'y', and the next
line of output from the migrator.
Instead, let's only print that secondary newline when one isn't given in
the answer. This should never be the case (cf. ReadString()), but will
harden the code to changes like opening /dev/tty in raw mode and reading
character-by-character.
To do so, ensure that the answer doesn't satisfy:
strings.HasSuffix(answer, "\n")
...and print a newline iff it doesn't.
In a related sense, ignore io.EOF errors.
In 'git lfs migrate import' and 'git lfs migrate export', Git LFS makes
destructive changes to a caller's repository and therefore invokes 'git
checkout --force', which throws away local changes.
To prevent this, let's introduce a check that notifies users when they
are going to throw away local changes, and allows for the user to abort
if they do not wish to discard their local changes.
With this, a user can safely migrate over dirty repositories (i.e., in
the case that they wanted to fix a file that should have been in Git
LFS, but wasn't) without having to finagle their repository to get it
into a migrate-able state.
For users with lots of pending migrations, also teach --yes, which
skips the prompt and instead simply prints the warning
message to STDERR.
In [1] and [2], callers of the 'git lfs migrate' command were surprised
when `--everything` did not migrate over everything, as its name
implies.
In the documentation, we specified that `--everything` applies only to
local references, as the default behavior of 'git lfs migrate' is to
never require a force-push unless asked to do so explicitly.
`--everything` sounds dangerous enough to imply that a user
wants _everything_ in their repository migrated. So, let's loosen the
requirement and make it mean that.
Alternatively, we could change the meaning of `--everything` in this
fashion and introduce `--everything-local` for the old behavior. We could
also introduce `--force`, leave the meaning of `--everything` unchanged, and
only exhibit this behavior with `--everything --force`. Both of these
options add too much surface area and complexity for use cases that seem
less common, and/or could be accomplished with clever `git for-each-ref`
and `xargs`-ing.
[1]: https://github.com/git-lfs/git-lfs/issues/2984
[2]: https://github.com/git-lfs/git-lfs/issues/3118
In 5346b027 (commands/command_migrate.go: introduce '--fixup' flag on
'import', 2018-07-06), we explicitly made the Filter member of a
*githistory.Rewriter nil to avoid considering filters given over
--include or --exclude.
Since we explicitly ExitWithError if we detect either --include or
--exclude, we can avoid this check, since we know that if we've gotten
this far, both --include and --exclude will be unset, so any filter in
place will behave as a no-op.
A common invocation of the 'git lfs migrate import' command is with
'--include' and/or '--exclude' flag(s), which specify wildmatch
pattern(s) for which paths to migrate and/or not migrate.
This is useful for retroactively importing a set of files into Git LFS's
care, or fixing up a file that should have been tracked by Git LFS but
was accidentally committed as a large object instead.
In the latter case, it is often the reality that a user will run 'git lfs
migrate import' with an '--include' path that they believe will gather
the file (and the file alone). This approach is brittle because it
requires the user to infer not only the applicable pattern but the
meaning of that pattern. It also requires the user to run more than one
migration when fixing multiple types of files.
The .gitattributes file(s) contained within a repository provide an
authoritative source on what file(s) are considered by Git to be tracked
in Git LFS. We can use this information to infer the correct patterns to
``fix up'' a broken repository.
In the simplest case, if a repository's .gitattributes file contains the
following:
*.txt filter=lfs merge=lfs diff=lfs -text
but a .txt file matched by that pattern is not parseable as an LFS
pointer, the file will appear as unable to be checked out.
Running 'git lfs migrate import --fixup --everything' will correctly
traverse history and find the affected .txt file, read it, create an
object file for it, and store it as an LFS pointer in history.
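Deciding whether a blob already is a pointer could look roughly like the
sketch below. It is a simplified check against the version, oid, and size
lines of the Git LFS pointer format, not the parser Git LFS actually uses,
and the name looksLikePointer is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// pointerVersion is the first line of a version 1 Git LFS pointer file.
const pointerVersion = "version https://git-lfs.github.com/spec/v1"

// looksLikePointer is a hypothetical, simplified check. It reports whether
// contents starts with the pointer version line and also carries "oid" and
// "size" lines. Real pointer parsing is stricter than this.
func looksLikePointer(contents string) bool {
	lines := strings.Split(strings.TrimSuffix(contents, "\n"), "\n")
	if len(lines) < 3 || lines[0] != pointerVersion {
		return false
	}

	var hasOid, hasSize bool
	for _, line := range lines[1:] {
		switch {
		case strings.HasPrefix(line, "oid sha256:"):
			hasOid = true
		case strings.HasPrefix(line, "size "):
			hasSize = true
		}
	}
	return hasOid && hasSize
}

func main() {
	pointer := pointerVersion + "\n" +
		"oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393\n" +
		"size 12345\n"

	fmt.Println(looksLikePointer(pointer))            // prints "true"
	fmt.Println(looksLikePointer("plain text file\n")) // prints "false"
}
```

A blob that matches a tracked pattern but fails such a check is a candidate
for being fixed up.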
Thus, a user can run one command which will recognize arbitrarily
complex problems where a file should be tracked by Git LFS, but isn't.
Later, this feature could be combined with the new 'git lfs migrate
export' functionality to also clean files _out_ of Git LFS to object
files when they are not supposed to be tracked as Git LFS objects.
To determine the paths to migrate from a repository's .gitattributes, a
caller must do the following two things in order:
1. Read the .gitattributes file(s) in a given tree contained within
the repository.
2. Rewrite blobs according to the attributes applied to their paths
via the .gitattributes file(s) read in (1).
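As an illustration of step (1), the sketch below extracts the patterns
marked 'filter=lfs' from the text of a single .gitattributes file. The
helper name lfsPatterns is hypothetical, and quoted patterns, macros, and
the cascading of attributes between directories are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// lfsPatterns is a hypothetical helper. It returns the patterns in the given
// .gitattributes text that carry the "filter=lfs" attribute. Quoted
// patterns and attribute macros are not handled in this sketch.
func lfsPatterns(gitattributes string) []string {
	var patterns []string
	for _, line := range strings.Split(gitattributes, "\n") {
		fields := strings.Fields(line)
		// Skip blank lines, comments, and patterns without attributes.
		if len(fields) < 2 || strings.HasPrefix(fields[0], "#") {
			continue
		}
		for _, attr := range fields[1:] {
			if attr == "filter=lfs" {
				patterns = append(patterns, fields[0])
				break
			}
		}
	}
	return patterns
}

func main() {
	attrs := "*.txt filter=lfs merge=lfs diff=lfs -text\n*.md text\n"
	fmt.Println(lfsPatterns(attrs)) // prints "[*.txt]"
}
```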
The framework for accomplishing the task necessary in (1) was written in
the previous commit. This commit introduces the rest of that mechanism
for (1).
Because a Git object's SHA-1 signature depends on its children, we must
visit the object graph in a topological ordering. This is not sufficient
for our purposes, since the patterns in a .gitattributes file cascade
downwards.
In other words, while we have to migrate from the leaves of the tree to
its root, we have to read the .gitattributes file(s) from root to
leaves.
To accomplish this, we introduce a new callback function in the
*githistory.RewriteOptions structure, TreePreCallbackFn, which is called
once as soon as a tree is opened for the first time, and before any
blobs or sub-trees are rewritten.
This provides the optimal time to inspect the repository's contents for
interesting .gitattributes files before migrating the blobs within.
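The ordering can be illustrated with a small walk that takes a "pre"
callback, called when a tree is first opened, and a "post" callback,
called once its sub-trees are done. The Tree type and walk function are
hypothetical stand-ins rather than the githistory implementation.

```go
package main

import "fmt"

// Tree is a hypothetical, simplified stand-in for a Git tree.
type Tree struct {
	Path     string
	Children []*Tree
}

// walk calls pre as soon as a tree is opened, before descending into any of
// its sub-trees, and calls post once all of its sub-trees have been visited.
func walk(t *Tree, pre, post func(*Tree)) {
	pre(t)
	for _, child := range t.Children {
		walk(child, pre, post)
	}
	post(t)
}

func main() {
	root := &Tree{Path: "/", Children: []*Tree{
		{Path: "/a", Children: []*Tree{{Path: "/a/b"}}},
	}}

	walk(root,
		// Root to leaves: the point at which .gitattributes can be read.
		func(t *Tree) { fmt.Println("read attributes:", t.Path) },
		// Leaves to root: the point at which a tree can be rewritten.
		func(t *Tree) { fmt.Println("rewrite tree:", t.Path) },
	)
}
```

Attributes are therefore seen for "/" before "/a", while "/a" is rewritten
before "/".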
We will use this new callback function in the following commit in order
to do precisely the task described above.