Last year, I learned how to extract directories or files from Hg/Git repos into Hg, and now I have just learned how to do the same thing from Git to Git.

GitHub Help: Splitting a subpath out into a new repository is a great starting point. If you only need to extract an entire sub-directory, this is enough:


git filter-branch \
--prune-empty \
--subdirectory-filter wanted-subdir \
-- --all

After the command finishes, the files in wanted-subdir reside in the root directory of the repository instead of in the original sub-directory.
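To see the effect without touching a real repository, here is a minimal sketch that builds a throwaway repository and runs the same command on it. The file names (a.txt, other/b.txt) and the demo identity are made up for illustration, and I am assuming a Git version that still ships filter-branch; FILTER_BRANCH_SQUELCH_WARNING=1 only silences the notice that newer Git versions print before starting.

```shell
# Build a scratch repository with two directories (all names are illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name demo
git config user.email demo@example.com

mkdir wanted-subdir other
echo a > wanted-subdir/a.txt
echo b > other/b.txt
git add .
git commit -q -m 'add two directories'

# Rewrite history so that wanted-subdir becomes the repository root.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
  --prune-empty \
  --subdirectory-filter wanted-subdir \
  -- --all >/dev/null 2>&1

# List what is tracked after the rewrite.
result=$(git ls-files)
echo "$result"
```

If the rewrite worked, only a.txt should be listed, now at the top level. On a real project it is probably safest to try this on a fresh clone first.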

But if you need something more specific, such as only certain files from that sub-directory, here is the command:


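git filter-branch \
--prune-empty \
--subdirectory-filter wanted-subdir \
--index-filter '
# Keep the two wanted files and unstage everything else.
# Plain [ ] is used instead of [[ ]] because the filter may run under a non-bash sh.
git ls-files |
while read -r filename; do
[ "$filename" = wanted_file1 ] || \
[ "$filename" = wanted_file2 ] || \
git rm --cached "$filename"
done
' \
-- --all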

Not all files in wanted-subdir make it into the new history; only wanted-subdir/wanted_file1 and wanted-subdir/wanted_file2 are kept. There might be an easier way to do this, but I don’t know of one yet. Note that the wanted-subdir/ prefix isn’t included in the --index-filter command: by that point, git ls-files already returns the result after --subdirectory-filter has been applied.
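Before rewriting anything, it may help to dry-run the keep list. The sketch below applies the same selection to a list of names and prints what would be removed, without calling git rm. The helper files_to_remove and the sample names notes.txt and build.log are hypothetical, and I use case here instead of the chained tests, which should behave the same for exact names.

```shell
# Hypothetical helper: reads file names on stdin and prints the ones
# the index filter would remove (everything except the two wanted files).
files_to_remove() {
  while read -r filename; do
    case "$filename" in
      wanted_file1|wanted_file2) ;;      # kept, so print nothing
      *) printf '%s\n' "$filename" ;;    # would be passed to git rm --cached
    esac
  done
}

# Try it on a made-up listing.
removed=$(printf '%s\n' wanted_file1 notes.txt wanted_file2 build.log | files_to_remove)
echo "$removed"
```

In a real repository, something like ( cd wanted-subdir && git ls-files | files_to_remove ) should give the same view for the current commit, since git ls-files prints paths relative to the directory it is run from.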

If the case is simpler, for example, you only need to remove a file or two, then just use:


--index-filter '
git rm --cached --ignore-unmatch \
file1_to_be_excluded file2_to_be_excluded
'

You will need --ignore-unmatch; without it, git rm returns a non-zero exit status and halts the rewriting process on any commit where the unwanted file is not in the index.
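The difference in exit status is easy to check in an empty scratch repository. This is only a sketch: missing_file is a made-up name that is never added, so it stands in for a commit where the unwanted file does not exist yet.

```shell
# Scratch repository with an empty index.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Without the flag, git rm fails because nothing matches the path.
git rm --cached missing_file >/dev/null 2>&1
without_flag=$?

# With the flag, an unmatched path is not treated as an error.
git rm --cached --ignore-unmatch missing_file >/dev/null 2>&1
with_flag=$?

echo "without: $without_flag, with: $with_flag"
```

The first status should be non-zero and the second zero, which is why the filter only survives every commit when the flag is present.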

You may want to read git-filter-branch(1) for more information on other filters such as --tree-filter, --msg-filter, or --commit-filter. You can do quite a lot of history rewriting with filter-branch.