May 25, 2017 by Daniel P. Clark

Leveling up with Git

This past weekend I got to spend 4 days at the Ruby for Good event.  This is an event where programmers get together and volunteer their efforts in charitable programming for many needs in our communities.  The team I joined consisted of 10 developers, and most of us grouped up with about 2 people per domain to be implemented.  Each pull request on GitHub needed to be reviewed before being merged in, and this process has led to me leveling up my git skills.

I’ll be using greater than and less than symbols as a wrapper around text you would substitute with your own naming, e.g. <some_name_here>

git Basics

For a quick review I’ll go over some basics of using git.  When you use a site such as GitHub, it will typically display the first few steps to get your project set up and linked with it.  These are most of the things you’ll be doing after that.

  • git add .  — add all the changes made in the current directory with subdirectories (anything listed in the .gitignore  file will be ignored)
  • git add <specific/file/to.add> — add just one file to be committed (can be done multiple times for multiple files)
  • git commit -m <quoted_description> — create a git commit which can be submitted to your public or private code repository with a quick message
  • git commit — create a git commit which can be submitted to your public or private code repository, opening your default command line text editor so you can write a potentially long description of the changes made
  • git push <remote_source> <branch_name> — push the code up to the remote repository into the branch name given
  • git pull <remote_source> <branch_name> — fetches and merges remote’s branch locally
  • git fetch <remote_source> — make the remote code available locally without merging in the code
  • git merge <remote_source>/<branch_name> — merge the fetched branch in locally
  • git branch — list branches on current machine
  • git checkout <branch_name> — switch the code over to what’s in chosen branch
  • git checkout -b <branch_name> — create a new branch and switch to it (this fails if the branch already exists)
  • git branch -D <branch_name> — delete a branch locally
  • git push <remote_source> :<deleted_branch_name> — delete the branch on the remote repository

These are all the commands I would use for my own personal repository.  When you are the only one committing to a repository you won’t need to do much else.
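
To see several of these commands working together, here is a minimal sketch that runs entirely inside a throwaway directory.  The file name, branch name and commit identity are made up for illustration, and nothing is pushed anywhere.

```shell
set -e
# Throwaway identity and repository so nothing real is touched (all names are hypothetical).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master

echo "hello" > README.txt
git add README.txt                        # stage one specific file
git commit -q -m "Add README"             # commit with a short message

git checkout -q -b demo_branch            # create a branch and switch to it
echo "more" >> README.txt
git add .                                 # stage everything changed under this directory
git commit -q -m "Extend README"

git checkout -q master                    # switch back to master
git branch                                # lists demo_branch and master
git branch -D demo_branch                 # force-delete the local branch
```

Because demo_branch was never merged, deleting it also discards its commit from master’s point of view, which is why the capital -D is needed here.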

Working with a Team

When working with a team and having someone review your code there are some additional steps to take with how you manage your code submissions.  I’ll do my best to illustrate how this works.

The code you’ll be working with should be in 3 locations.  The main project repository will be known as the upstream repository.  You will then need to fork it, which will create a copy of the code under your account on GitHub.  Once you have that you can clone it to your personal computer to work on.  Once you’ve cloned the fork to your personal computer, add the upstream to your git config by typing:

git remote add upstream https://github.com/<ORIGINAL_OWNER>/<ORIGINAL_REPOSITORY>.git

Or if you use SSH keys instead of passwords for GitHub then you’d type:

git remote add upstream git@github.com:<ORIGINAL_OWNER>/<ORIGINAL_REPOSITORY>.git

Now you have two repositories you can pull from locally: origin (your fork) and upstream (the original repository).
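
A quick way to check the result is git remote -v.  The sketch below uses two empty local bare repositories as stand-ins for the GitHub URLs, so the paths are assumptions made only for this demo.

```shell
set -e
cd "$(mktemp -d)"
# Local bare repositories stand in for the original project and your fork (hypothetical paths).
git init -q --bare upstream.git
git init -q --bare fork.git
git clone -q fork.git work 2>/dev/null    # cloning an empty repository only prints a warning
cd work
git remote add upstream ../upstream.git
git remote -v                             # shows origin and upstream, each for fetch and push
```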

If the original repository (upstream) has been updated since you created your fork then you will want to update your fork as well.  To do that you would take these steps.

git checkout master
git fetch upstream
git merge upstream/master
git push origin master
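
If you want to try this update without touching a real project, the following sketch rehearses it with local bare repositories playing the roles of upstream and your fork.  The directory names (seed, work), the file app.txt and the commit identity are all hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
root="$PWD"

# 1. A stand-in upstream with one commit on master.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git clone -q upstream.git seed 2>/dev/null
cd seed
git symbolic-ref HEAD refs/heads/master
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git push -q origin master

# 2. "Fork" it as a bare copy, then clone the fork like you would from GitHub.
cd "$root"
git clone -q --bare upstream.git fork.git
git clone -q fork.git work
cd work
git remote add upstream "$root/upstream.git"

# 3. Upstream moves ahead while the fork stands still.
cd "$root/seed"
echo "v2" > app.txt
git commit -q -am "Upstream change"
git push -q origin master

# 4. Bring the fork up to date.
cd "$root/work"
git checkout -q master
git fetch -q upstream
git merge -q upstream/master      # a fast-forward here, so no editor opens
git push -q origin master
```

After step 4 the local master, origin/master and upstream/master all point at the same commit.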

Let’s say you’ve already started work on a feature branch before doing this.  This would mean that your feature branch is not caught up with the code in the master branch.  After doing the above you will need to bring the changes from the master branch into your feature branch by typing:

git checkout <feature_branch>
git merge origin/master

Each time you run the git merge command and git has to create a merge commit, a CLI text editor will open up for the new commit message (a fast-forward merge creates no commit, so no editor opens).  Using the default merge message is okay.  If there are any problems merging the code in you will see a list of the files that have conflicts.  You will need to edit these manually.  In each conflicting spot git will have written both versions of the code, yours and the incoming one, separated by <<<<<<<, ======= and >>>>>>> marker lines.  Keep the version that belongs, remove the other one along with the markers, then git add the file and git commit to finish the merge.
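
Here is a small sketch that produces a conflict on purpose and then resolves it, so you can see what git writes into the file.  The file settings.txt, its contents and the branch name feature are invented for the demo.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "Add settings"

git checkout -q -b feature
echo "color = blue" > settings.txt
git commit -q -am "Feature: use blue"

git checkout -q master
echo "color = green" > settings.txt
git commit -q -am "Master: use green"

git checkout -q feature
git merge master || true            # stops and reports a conflict in settings.txt
cat settings.txt                    # both versions, wrapped in conflict marker lines

echo "color = blue" > settings.txt  # keep the version that belongs, markers removed
git add settings.txt                # tell git the conflict is resolved
git commit -q --no-edit             # conclude the merge with the default message
```

The || true is only there so the script keeps going after the failed merge.  At the command line you would simply read the conflict report and start editing.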

Once your feature branch is ready to be published and you want to open a pull request you can type:

git push origin <feature_branch>

Then visit the original repository on GitHub in your browser and you will see a highlighted line near the top of the page showing your fork/branch name with a button to create a “New Pull Request”.  Go ahead and submit it there, and enter Fixes #34 in the comment section if your PR will fix or complete issue number 34 from GitHub’s tracker.  This will automatically close that issue once the PR is merged.

Added Complexity

Sometimes your PR can’t be reviewed and accepted within a reasonable time frame.  You may also need to continue working on top of your previous work.  And if the project is active enough that the master branch changes while you’re coding, you will begin to dance between branches a lot to merge changes into your feature branches.

To start work on an additional branch on top of your previous branch you simply create the new branch while you have the previous one checked out.  This will create your new feature branch with code based on whatever branch you did the checkout from.

git checkout <feature_branch>
git checkout -b <second_feature_branch>

Now you can write all your new code for the second feature and submit it for a PR as well.  This second PR will show all of the git history from the previous PR, so your pull request will look a bit more cluttered and have somewhat unrelated details in it.  But there’s not much you can do about that.  It is what it is.
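
The sketch below shows why the second PR carries the first one’s history.  The branch names feature_one and feature_two and the files are placeholders, and everything happens in a temporary repository.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature_one
echo "one" > one.txt
git add one.txt
git commit -q -m "First feature"

git checkout -q -b feature_two      # starts from feature_one because that is what is checked out
echo "two" > two.txt
git add two.txt
git commit -q -m "Second feature"

git log --oneline master..feature_two   # lists "Second feature" and "First feature"
```

Compared against master, feature_two contains both commits, and that is what a pull request into master would show.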

Now when the team makes changes to the upstream master branch you have upwards of double the work of merging in updates.  Updating them will look like this:

git checkout master
git fetch upstream
git merge upstream/master
git push origin master
git checkout <feature_branch>
git merge origin/master
git push origin <feature_branch>
git checkout <second_feature_branch>
git merge origin/<feature_branch>
git push origin <second_feature_branch>

At each merge along the way you’ll need to verify and fix any conflicts.  The commit history for the second feature branch will contain the merge commits from both of the previous branches as well as its own.  So it looks pretty bad.  But this can be considered somewhat necessary in most cases, as the history of git commits is important to many people.

History Cleanup (generally taboo)

If your commit messages don’t carry any useful information, writing better ones is a practice you need to learn.  For now, let’s assume that the git history isn’t important in these two feature branches (generally not a good assumption).  In that case we can go ahead and clean up our git history.

Now there is a git feature called rebase which is kinda nice for cleaning up multiple commits in one branch into a more legible git history.  But you shouldn’t rewrite commits that another branch has already merged in.  For example, you may run git rebase -i HEAD~2 on <second_feature_branch> to merge its last two commits into one (if you choose to), but you can’t safely do it on <feature_branch> when <second_feature_branch> already has those commits merged into it.  To do something like that you’ll need something more elaborate.
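
To get a feel for what git rebase -i HEAD~2 does, here is a scripted sketch on a throwaway branch.  Normally git opens an editor with the list of commits and you change pick to squash by hand.  Here sed makes that edit through GIT_SEQUENCE_EDITOR and the combined commit message is accepted as-is, which is a shortcut for this demo only.  The file and commit names are made up.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
  echo "step $n" >> notes.txt
  git add notes.txt
  git commit -q -m "Step $n"
done

# The todo list has one "pick" line per commit, oldest first. Turning the second
# "pick" into "squash" folds the newest commit into the one before it.
# GIT_EDITOR=true accepts the combined commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i -e '2s/^pick/squash/'" GIT_EDITOR=true git rebase -i HEAD~2

git log --oneline                   # two commits remain instead of three
```

The file contents are unchanged by this.  Only the way the history is sliced into commits is different.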

A way you can create two clean histories is to use git diff  to take the differences between two branches and export that to a patch file.  You can then take that patch file and apply it to a brand new clean branch.  This is what that would look like.

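git diff origin/<feature_branch> origin/<second_feature_branch> > ../second_feature_branch.patch
git diff origin/master origin/<feature_branch> > ../feature_branch.patch
git checkout master
git checkout -b clean_feature
git apply ../feature_branch.patch
# git apply only changes the working tree, so commit before branching again
git add -A
git commit -m <quoted_description_of_first_feature>
git checkout -b second_clean_feature
git apply ../second_feature_branch.patch
git add -A
git commit -m <quoted_description_of_second_feature>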

Now you have branches with identical code.  The clean branches don’t have any of the previous commit history whereas the original feature branches still do.  There are times when this may be a good idea, but generally we keep a history of the commits for reference, to have extra insight into what’s going on and why.  Note: a plain git diff patch covers text files only; binary files are only included if you generate the patch with git diff --binary.
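
As a sanity check on the idea, this sketch builds a branch with two throwaway commits, replays it as a single commit through a patch file, and then confirms that both branches hold the same code.  The branch names messy_feature and clean_feature, the file app.txt and the patch location are assumptions for the demo, and the apply step is followed by an explicit commit.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

git checkout -q -b messy_feature
echo "wip" >> app.txt
git commit -q -am "wip"
echo "done" >> app.txt
git commit -q -am "fix typo"

git diff master messy_feature > ../feature.patch   # everything the branch changed, as one patch
git checkout -q master
git checkout -q -b clean_feature
git apply ../feature.patch                         # changes the working tree only
git add -A
git commit -q -m "Add feature in a single commit"

git diff --quiet messy_feature clean_feature       # exits 0 when the two branches are identical
```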

The above patch solution is what I do whenever I get an unfixable conflict where git stops cooperating with merges.  So it’s a good one to have in your back pocket.

Sometimes during a fix you just need to pull one file from one branch to another.  This is very simple to do.  I’ll demonstrate a branch to branch file update on the local machine.

git checkout <some_branch>
git checkout <some_other_branch> <path/to/file/on/other/branch>

You will have now copied the file from <some_other_branch> into <some_branch>, and voilà!
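
Here is the same move in a temporary repository, as a sketch with made-up branch and file names.  It also uses the optional -- separator, which tells git that what follows is a path and not a branch name.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "old" > config.txt
echo "keep" > other.txt
git add .
git commit -q -m "Base"

git checkout -q -b source_branch
echo "new" > config.txt
echo "changed" > other.txt
git commit -q -am "Update both files"

git checkout -q -b target_branch master    # a second branch that starts from master
git checkout source_branch -- config.txt   # copy just this one file from source_branch
git status --short                         # config.txt is staged, other.txt is untouched
```

The copied file arrives already staged, so it only needs a git commit to become part of target_branch.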

TDD and git commits

To make your git commit history valuable it’s probably best to add committing as the last step in the Red, Green, Refactor practice.  That way each small increment is documented in the code history along with any thoughts you had on why you implemented it that way.

The steps would go:

  • red — you write a test and run it to get a failing result
  • green — you write the code that makes it pass
  • refactor — you both consider and re-implement the code to be more manageable and easy to work with in the future without changing the code’s behavior
  • commit — you document the what and why

With a git commit history like this you will likely get a level of admiration and respect from your team as you have shown yourself to be thoughtful by leaving evidence of a well thought out process.  This is not super common in the developer community, yet this is the ideal we should strive to achieve.
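
One way to record both the what and the why without opening an editor is to pass -m twice: the first becomes the subject line and the second becomes a separate paragraph in the body.  This is only a sketch, and the file and message text are invented for illustration.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$(mktemp -d)"
git init -q .
echo "total = sum of item prices" > cart.txt
git add cart.txt

# First -m: what changed. Second -m: why it was done this way.
git commit -q -m "Add cart total" \
  -m "Summing in one place keeps the tax rules out of the view code."

git log -1 --format=%B              # prints the subject, a blank line, then the body
```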

Summary

git is a very powerful code management system and has many features to be discovered and taken advantage of as you go along.  I quite enjoy using it, even when situations get difficult and it pushes me to learn how to use it better.  I hope you will see it in the same light when you come into difficult situations.  It is at these times that we learn and are all made better for it.

I hope you enjoyed this blog post.  Feel free to share any git tips and tricks you have.  I look forward to your feedback and input.  Remember to comment, share, subscribe to my RSS Feed, and follow me on twitter @6ftdan

God Bless!
-Daniel P. Clark

Image by Zach Stern via the Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic License


Comments

  1. Dave Aronson
    June 25, 2017 - 9:13 am

    The bit about the project being “active enough where the master branch is changed while you’re coding” reminds me of this writeup I did a while back of a project where that was becoming a royal pain: http://blog.codosaur.us/2013/03/all-right-break-it-up.html
