
 EXTREMELY ROUGH DRAFT

 

Now that we have started to transition away from Subversion to Git (Github.com), we need to begin to adhere to a few basic guidelines. What is described in this document will help you and your fellow developers stay in sync and on track. SSB adopted the "forking workflow", which means you will perform most of your daily work locally, far away from the central repository. You will interact with a special copy of the code called a "fork", which is a clone of the central repository linked with your Github account. Do NOT write directly to the central repository unless you are performing maintenance tasks, or creating a release tag.

After creating a fork, you will clone it to your local system just like any other git repository. In this local copy, after you have made some changes, you can push them upstream by creating a "pull request" using the built-in Github.com interface. A pull request, at its core, is a snapshot of the code in your fork that will be merged into master on the upstream repository. Other developers can even help work on your code by forking your fork and creating a pull request against your repository.

The central repository for any given project is a "master-only" copy of the source code. Pull requests are made against this copy, and team leads comment on, accept, or reject those changes. You may be thinking: "Wow, all we need is another layer of complexity." Well, it isn't exactly complex.

Before we begin

Please read this outline describing the "Forking Workflow" in great detail. Next, take a look at this tutorial on Github, because it is specifically tailored to the service we're using.

Configure the Git Client

Git requires you to identify yourself so it can keep track of who is making changes. If you forget to do this, Git will complain when you make your first commit and print a small set of instructions detailing how to fix the situation. To avoid this annoyance, simply set the following configuration options (replacing "Your Name" and "you@example.com"):

git config --global user.name "Your Name"
git config --global user.email you@example.com

NOTE:

Be aware whatever you define here will be permanently attached to every commit you make. However, if you wish to be known to the public as "Codelover McBiscuits <cucumber@happymealsforastronauts.net>", that's your prerogative.
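If you would rather use a different identity for one repository only, the same options can be set without --global. The following is a minimal sketch, assuming a throwaway repository in a temporary directory; the path and the identity are placeholders, not part of our setup.

```shell
# Scratch repository; the directory is a placeholder created only for this demo
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"

# Without --global the setting is stored in this repository's .git/config only
git config user.name "Your Name"
git config user.email you@example.com

# Read the values back to confirm what will be stamped on new commits here
configured_name="$(git config --get user.name)"
configured_email="$(git config --get user.email)"
echo "$configured_name <$configured_email>"
```

Repository-level settings take precedence over the global ones, so this is a handy way to keep a work address on work repositories.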

OK that's cool... but how do I do my job?

Imagine Megan says to you, "I'm going to be out sick for two weeks in Cancún but I really need you to implement a cool new feature in HSTCAL while I'm out. Believe me, I wish I could stay and code it all myself."
You say, "Gee, I'm sorry to hear that, Megan, but I'm more than happy to help out because I know how much Cancún sucks. I hope you feel better."

Back at your desk, with your newly realigned priorities fresh in your mind, you get to work...

Create a fork of HSTCAL

Navigate to https://github.com/spacetelescope/hstcal

Making a fork of the repository is as easy as clicking on the "Fork" button.

When prompted by Github to choose the account to store the fork in, you will select Snake Plisskin, or in this case, my personal account: @jhunkeler

Now Github will churn away for a few moments while it creates your personal fork...

And you will be automagically redirected to your fork's landing page.

 

Cloning the Repository

In order to start working with the code you will first need to obtain it! So let's do that...

In a terminal...
$ git clone git@github.com:jhunkeler/hstcal.git
Cloning into 'hstcal'...
remote: Counting objects: 14046, done.
remote: Compressing objects: 100% (1905/1905), done.
remote: Total 14046 (delta 12172), reused 13977 (delta 12124), pack-reused 0
Receiving objects: 100% (14046/14046), 10.35 MiB | 2.22 MiB/s, done.
Resolving deltas: 100% (12172/12172), done.
Checking connectivity... done.

Now change your working directory so that you are inside of the hstcal directory

$ cd hstcal

A little housekeeping

In order to keep your forked code in sync with the central repository (henceforth referred to as "upstream"), you need to tell Git you have more than one source of repository data. At this point, go ahead and add the upstream spacetelescope/hstcal.git repository as a remote.

$ git remote add upstream https://github.com/spacetelescope/hstcal.git

What did this statement do? The TL;DR version is simple: you told Git to add a remote repository URL called "upstream" to the list of available repositories. Now if you were to look inside of .git/config you would see this command appended a few lines:

cat .git/config #(truncated)
[remote "upstream"]
    url = https://github.com/spacetelescope/hstcal.git
    fetch = +refs/heads/*:refs/remotes/upstream/*

Notice the remote "origin" is also defined in that file. The "origin" generally refers to the URL of the repository you cloned from. More on this later.

[remote "origin"]
    url = git@github.com:jhunkeler/hstcal.git
    fetch = +refs/heads/*:refs/remotes/origin/*

There's an easier way, however. Just run git remote -v to review any/all repositories available to you.

$ git remote -v
origin  git@github.com:jhunkeler/hstcal.git (fetch)
origin  git@github.com:jhunkeler/hstcal.git (push)
upstream    https://github.com/spacetelescope/hstcal.git (fetch)
upstream    https://github.com/spacetelescope/hstcal.git (push)
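If you want to try the two-remote setup without touching GitHub, it can be rehearsed entirely on your local disk. This is only a sketch under stated assumptions: two empty bare repositories in a temporary directory stand in for the central repository and your fork, and all paths are placeholders.

```shell
# Everything lives in a throwaway directory; the repository names are placeholders
work="$(mktemp -d)"
git init -q --bare "$work/upstream.git"   # stands in for spacetelescope/hstcal
git init -q --bare "$work/fork.git"       # stands in for your personal fork

# Clone the "fork"; Git registers it as the remote named "origin"
git clone -q "$work/fork.git" "$work/hstcal" 2>/dev/null
cd "$work/hstcal"

# Register the central repository as a second remote named "upstream"
git remote add upstream "$work/upstream.git"

# List the remote names and look up the URL recorded for "upstream"
remotes="$(git remote | sort | xargs)"
upstream_url="$(git config --get remote.upstream.url)"
echo "remotes: $remotes"
echo "upstream URL: $upstream_url"
```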

Time to start working

Since you are transitioning from SVN, the first order of business here is to forget everything you know about maintaining your code. Git is not SVN, and treating it as such will only lead to suffering. Unless you're into that sort of thing. Basically you need to get out of the mindset of working inside of "trunk" (or "master" in this case). All new features, bug fixes, hot fixes, and any other Agile/Scrum buzzword task go into their own well-named branch. This branch will be merged, via pull request, back into the upstream repository.

Create a shiny new branch

Since we are not going to be working with master at all, just pretend it isn't there. Seriously. Instead, create a new branch with a thoughtful name that is both easy to read and identifies the work being performed. Megan told us before she fell ill and hopped on a plane that we're implementing a "cool new feature", so we're going to call our new branch... "cool-new-feature". Another example would be if Megan had asked us to take care of issue #1234 while she was out, so then perhaps we'd call the branch "bugfix-1234".

Creating a new cool feature
$ git checkout --track -b cool-new-feature
Branch cool-new-feature set up to track local branch master.
Switched to a new branch 'cool-new-feature'
 
# Note: The same command without using long-options
# $ git checkout -t -b cool-new-feature

WARNING: If you omit --track or -t, git checkout -b will still create the new branch and switch to it, however you will be responsible for setting up tracking manually. So to keep things simple, always use the --track or -t argument when creating a new branch.
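To double-check which branch you are on and what it tracks, you can ask Git directly. A minimal sketch, assuming a scratch repository with a single empty commit; unlike the command above, the start point (master) is spelled out explicitly here just to keep the demo unambiguous.

```shell
# Scratch repository with one commit on master (placeholder data for the demo)
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
git config user.name "Your Name"
git config user.email you@example.com
git commit -q --allow-empty -m 'Initial commit'

# Create the branch, switch to it, and set it up to track local master
git checkout -q -t -b cool-new-feature master

# Which branch is checked out, and what does it track?
current_branch="$(git rev-parse --abbrev-ref HEAD)"
tracked_branch="$(git rev-parse --abbrev-ref 'cool-new-feature@{upstream}')"
echo "on $current_branch, tracking $tracked_branch"
```

Running git branch -vv shows the same information for every local branch at once.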

Do some work

Man, this feature is going to be sweet. I can't wait until Megan gets back.

pkg/wfc3/calwf3/lib/cool.c
#include <stdio.h>
 
void cool();
 
void cool()
{
    /* Hard work, math, and science */
    int neat;
    int stuff;
    int occurred;
 
    neat = 1;
    stuff = 1;
    occurred = neat + stuff;
 
    printf("%d + %d = %d\n", neat, stuff, occurred);
}

 

Commit your work

Let's assume this magnificent piece of code was too awesome to just let it sit around and rot, so let's get the ball moving.

$ git add pkg/wfc3/calwf3/lib/cool.c
$ git commit -m 'Add cool new feature'
[cool-new-feature f753661] Add cool new feature
 1 file changed, 17 insertions(+)
 create mode 100644 pkg/wfc3/calwf3/lib/cool.c

You should continue to write code, tweak things, and repeat this process until you're satisfied your work is finished. Don't go overboard, either. If whatever you are working on looks like it's about to break out of the scope of your "cool-new-feature" branch, then make a new branch called "cool-new-idea" and call it a day. Branches in Git are there to help you keep things simple and organized, so do yourself a favor and don't overcomplicate things. You'll see why this might matter in a bit.

Push those changes

Now that your code is looking good and you're happy with it, it is time to push those changes up to your forked repository.

$ git push -u origin cool-new-feature
Counting objects: 7, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 695 bytes | 0 bytes/s, done.
Total 7 (delta 4), reused 0 (delta 0)
To git@github.com:jhunkeler/hstcal.git
 * [new branch]      cool-new-feature -> cool-new-feature
Branch cool-new-feature set up to track remote branch cool-new-feature from origin.

Earlier when you looked at git's internal configuration you noted "origin" pointed to "git@github.com:jhunkeler/hstcal.git". The origin is your fork. We are not pushing anything to the upstream repository directly. The upstream remote is reserved for a totally different operation.
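The effect of the -u flag can be verified after the push. Here is a rough sketch, assuming a local bare repository stands in for your fork on GitHub; every path and commit in it is a placeholder.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/fork.git"          # stands in for your fork on GitHub
git init -q "$work/hstcal"
cd "$work/hstcal"
git symbolic-ref HEAD refs/heads/master
git config user.name "Your Name"
git config user.email you@example.com
git remote add origin "$work/fork.git"
git commit -q --allow-empty -m 'Initial commit'
git checkout -q -b cool-new-feature
git commit -q --allow-empty -m 'Add cool new feature'

# Publish the branch to the fork and record it as the branch's upstream
git push -q -u origin cool-new-feature

# The fork now has the branch, and the local branch tracks it
published="$(git ls-remote --heads origin | awk '{print $2}')"
tracked_branch="$(git rev-parse --abbrev-ref 'cool-new-feature@{upstream}')"
echo "published: $published"
echo "tracking:  $tracked_branch"
```

Once the upstream is recorded, a plain git push or git pull on that branch knows where to go without further arguments.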

Check those changes

When the push operation completes, your local "cool-new-feature" branch is now available on your fork for everyone to see (and modify, if they so desire).

Since we're pretty confident Megan and the rest of the team will love this feature, we'll go ahead and make a pull request against spacetelescope/hstcal.git

 

Put in pull request for peer-review

If you click on "Compare & pull request", the following will appear.

As you can see, in the grayed-out box above the comment section, Github has pre-configured this pull request. If accepted, this changeset will be merged into upstream's "master" branch, from your fork's "cool-new-feature" branch.

On the same page, a little further down, it gives you a total overview of everything you're about to submit

If you're ready, then go ahead and click the "Create pull request" button. Maintainers will be notified via an automated email that a new pull request has been submitted. Of course, I didn't go through with it, because it would have cluttered up the pull request history on the upstream, so instead I'll show you an example of a pull request in progress elsewhere.

What a pull request looks like from the other side

(Yes I know this pull request is against my own fork, but hey, that's possible too!)

 

Clicking on the title of the pull request will load a page similar to the submission page, but is now used for a back-and-forth chat with your team. Sometimes a second pair of eyes can save you time, so don't be bashful. Here's a quick example of what your pull request timeline might look like during active development.

...

... and at the very bottom

Down here, if you have the proper administrative access to the upstream repository, you will see a giant green button named "Merge pull request".

If there is a giant gray '/!\' symbol where the green checkmark is in this screenshot... Fix the problem before merging.

Be aware that if you decide to circumvent the pull request process and push directly to upstream, you risk breaking the master branch in the upstream repository and losing time cleaning up conflicts. Sometimes this is unavoidable, but make sure your peers are aware of what is about to happen.

It goes without saying... Please merge responsibly.

On a cheerier note, you can also leave comments on your own pull requests. This little box can be used to ask your co-workers for help, to blame your co-workers for breaking something (using the @username convention), or even as a brainstorming pad to help you keep your thoughts in one place as development presses on.

After your pull request has been accepted

Your code has been incorporated into the upstream repository. The first thing you will want to do is DELETE YOUR BRANCH. You no longer need it. Its purpose has been fulfilled, so do not keep it around. Not even so much as a checkpoint in time. Your checkpoint has already been recorded in the form of a commit log message upstream: "Merged cool-new-feature ... abc43ea", and can be restored and/or reviewed at any time for all eternity.

# Delete the local branch
$ git checkout master
$ git branch -D cool-new-feature
 
# Delete your remote fork's branch
$ git push origin --delete cool-new-feature

And that's it. You're ready to create another branch, or start plugging away on the "cool-new-idea" you created just before things got out of hand.

Restoring a branch

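If you ever find a compelling reason to restore a deleted branch, you can use the commit hash recorded in the commit log message to do so. The branch will look and feel exactly the way it did before you deleted it; there is absolutely no guesswork in this operation. Note that -t is left out this time, because a bare commit hash is not a branch, so Git has nothing to track and would refuse to set up tracking.

$ git checkout -b cool-new-feature-redux abc43ea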
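The delete-and-restore round trip can be rehearsed safely in a scratch repository. This is a sketch, not our actual history: the file name and commits are made up, and the new branch is created without -t because a commit hash is not a branch that Git can track.

```shell
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Your Name"
git config user.email you@example.com
git commit -q --allow-empty -m 'Initial commit'

# Do some work on a feature branch and note the hash of its last commit
git checkout -q -b cool-new-feature
echo 'hard work' > cool.txt
git add cool.txt
git commit -q -m 'Add cool new feature'
saved_hash="$(git rev-parse HEAD)"

# Delete the branch; the commit itself is still in the object database
git checkout -q master
git branch -q -D cool-new-feature

# Bring it back under a new name, starting from the saved hash
git checkout -q -b cool-new-feature-redux "$saved_hash"
restored_hash="$(git rev-parse HEAD)"
restored_file="$(cat cool.txt)"
echo "restored $restored_hash with contents: $restored_file"
```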

Keeping your fork up to date

Imagine you are not the only person creating a pull request, and the maintainers are merging others' requests at a rapid rate, so naturally your fork will quickly fall behind the upstream. This is not a big deal, because you already added the "upstream" (spacetelescope/hstcal.git) remote after cloning your repository. So, let's get back in sync.

First we need to "fetch" information from the upstream. This operation assimilates new information about the remote repository, such as branches, tags, and other important commit hashes, like where "master" points to these days. Think of it this way: if you never fetch new data, you will never receive new data. The following output is from my fork of Continuum's "conda" repository and it just so happens to be morbidly out of date.

# Switch to your 'master' branch
$ git checkout master
 
# Fetch the latest and greatest commit hash pointers
$ git fetch upstream
remote: Counting objects: 906, done.
remote: Compressing objects: 100% (113/113), done.
remote: Total 906 (delta 596), reused 544 (delta 541), pack-reused 252
Receiving objects: 100% (906/906), 270.24 KiB | 0 bytes/s, done.
Resolving deltas: 100% (672/672), completed with 100 local objects.
From https://github.com/conda/conda
 * [new branch]      3.x        -> upstream/3.x
 * [new branch]      4.0.x      -> upstream/4.0.x
 * [new branch]      feature/instruction-arguments -> upstream/feature/instruction-arguments
 * [new branch]      kalefranz/conda-as-shell-function -> upstream/kalefranz/conda-as-shell-function
 * [new branch]      master     -> upstream/master
 * [new branch]      mgrant/arg2spec -> upstream/mgrant/arg2spec
 * [new branch]      mgrant/bad-installed2 -> upstream/mgrant/bad-installed2
 * [new branch]      mgrant/doc-fixes -> upstream/mgrant/doc-fixes
 * [new branch]      mgrant/fix-2259 -> upstream/mgrant/fix-2259
 * [new branch]      mgrant/hint-simplify -> upstream/mgrant/hint-simplify
 * [new branch]      msarahan/explicit_script_interpreter -> upstream/msarahan/explicit_script_interpreter
 * [new branch]      wulmer-fix/activate-behaviour-on-windows -> upstream/wulmer-fix/activate-behaviour-on-windows
 * [new tag]         4.0.4      -> 4.0.4
 * [new tag]         4.0.1      -> 4.0.1
 * [new tag]         4.0.2      -> 4.0.2
 * [new tag]         4.0.3      -> 4.0.3

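Next, we perform a "pull" operation. Remember, "fetch" only downloads the new commits and updates your remote-tracking references (such as upstream/master) without touching your own branches, whereas "pull" performs a fetch and then merges the changes into the branch you currently have checked out.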
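The difference between the two operations is easy to observe by counting how many commits your master is behind upstream/master. The sketch below is an illustration under stated assumptions: a local bare repository plays the upstream, a second working copy plays a maintainer, and new_repo is a hypothetical helper defined only for this demo.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/upstream.git"      # stands in for the central repository

# Hypothetical helper for this demo: a repository on master with an "upstream" remote
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    git -C "$1" config user.name "Your Name"
    git -C "$1" config user.email you@example.com
    git -C "$1" remote add upstream "$work/upstream.git"
}
new_repo "$work/maintainer"
new_repo "$work/you"

# The maintainer publishes one commit, and you sync with it
git -C "$work/maintainer" commit -q --allow-empty -m 'First upstream commit'
git -C "$work/maintainer" push -q upstream master
cd "$work/you"
git pull -q upstream master

# Upstream moves ahead by one more commit
git -C "$work/maintainer" commit -q --allow-empty -m 'Second upstream commit'
git -C "$work/maintainer" push -q upstream master

# fetch moves upstream/master forward but leaves your master alone
git fetch -q upstream
behind_after_fetch="$(git rev-list --count master..upstream/master)"

# Merging (the second half of a pull) moves your master forward
git merge -q --ff-only upstream/master
behind_after_merge="$(git rev-list --count master..upstream/master)"
echo "behind after fetch: $behind_after_fetch, after merge: $behind_after_merge"
```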

# Merge the contents of upstream into your local fork
$ git pull upstream master
From https://github.com/conda/conda
   a0f00fb..231265a  master     -> origin/master
 * [new branch]      4.0.x      -> origin/4.0.x
 + 326f5b0...5ecfee2 kalefranz/conda-as-shell-function -> origin/kalefranz/conda-as-shell-function  (forced update)
 * [new branch]      mgrant/arg2spec -> origin/mgrant/arg2spec
 * [new branch]      mgrant/bad-installed2 -> origin/mgrant/bad-installed2
 * [new branch]      mgrant/doc-fixes -> origin/mgrant/doc-fixes
 * [new branch]      mgrant/fix-2259 -> origin/mgrant/fix-2259
 * [new branch]      mgrant/hint-simplify -> origin/mgrant/hint-simplify
Updating a0f00fb..231265a
Fast-forward
 .travis.yml                      |   10 +-
 CHANGELOG.txt                    |   32 ++
 README.rst                       |    9 +-
 appveyor.yml                     |   13 +-
 auxlib/__init__.py               |   26 +
 auxlib/packaging.py              |  160 ++++++
 auxlib/path.py                   |  102 ++++
 cmd/activate                     |  116 ++++
 cmd/activate.bat                 |   47 ++
 cmd/deactivate                   |   86 +++
 cmd/deactivate.bat               |   24 +
 conda.recipe/bld.bat             |    2 +
 conda.recipe/build.sh            |    1 +
 conda.recipe/meta.yaml           |    5 +-
 conda/api.py                     |   19 +-
 conda/cli/activate.py            |  181 +++++--
 conda/cli/common.py              |   22 +-
 conda/cli/find_commands.py       |    6 +-
 conda/cli/install.py             |   66 +--
 conda/cli/main.py                |   51 +-
 conda/cli/main_bundle.py         |    3 +-
 conda/cli/main_clean.py          |   32 +-
 conda/cli/main_config.py         |  149 +-----
 conda/cli/main_info.py           |    8 +-
 conda/cli/main_list.py           |   27 +-
 conda/cli/main_remove.py         |    7 +-
 conda/cli/main_search.py         |    4 +-
 conda/cli/misc.py                |   17 -
 conda/common/__init__.py         |    3 +
 conda/{ => common}/compat.py     |    0
 conda/{ => common}/connection.py |   33 +-
 conda/common/download.py         |  217 ++++++++
 conda/{ => common}/lock.py       |   25 +-
 conda/{ => common}/utils.py      |   51 ++
 conda/config.py                  |   18 +-
 conda/console.py                 |    6 +-
 conda/fetch.py                   |  217 +-------
 conda/install.py                 |  131 ++++-
 conda/instructions.py            |    3 +-
 conda/logic.py                   |   10 +-
 conda/misc.py                    |    9 +-
 conda/packup.py                  |    8 +-
 conda/pip.py                     |    8 +-
 conda/plan.py                    |   28 +-
 conda/resolve.py                 |  313 +++++------
 conda/version.py                 |    2 +-
 setup.py                         |   14 +-
 tests/conftest.py                |   15 +
 tests/helpers.py                 |   31 +-
 tests/test_activate.py           | 1251 ++++++++++++++++++++++++--------------------
 tests/test_cli.py                |   17 +-
 tests/test_config.py             |  183 +++----
 tests/test_info.py               |   26 +-
 tests/test_logic.py              |    2 +-
 tests/test_resolve.py            |   39 +-
 tests/test_utils.py              |    3 +-
 utils/travis-bootstrap-conda.sh  |   15 +-
 57 files changed, 2358 insertions(+), 1545 deletions(-)
 create mode 100644 auxlib/__init__.py
 create mode 100644 auxlib/packaging.py
 create mode 100644 auxlib/path.py
 create mode 100644 cmd/activate
 create mode 100644 cmd/activate.bat
 create mode 100644 cmd/deactivate
 create mode 100644 cmd/deactivate.bat
 delete mode 100644 conda/cli/misc.py
 create mode 100644 conda/common/__init__.py
 rename conda/{ => common}/compat.py (100%)
 rename conda/{ => common}/connection.py (97%)
 create mode 100644 conda/common/download.py
 rename conda/{ => common}/lock.py (82%)
 rename conda/{ => common}/utils.py (64%)
 create mode 100644 tests/conftest.py

Pretty cool, eh? In this example we didn't experience a merge conflict, because no one else was working on the code we're playing with. Conflict resolution in Git is relatively simple. Take a look here and here for more information about using the git mergetool utility. You can still fix the ugly <<< === >>> text markers by hand, but why bother? Just use mergetool! It will iterate over every conflict left behind by the last pull or merge operation. Keep things simple.

To push your updated repo to your remote fork (jhunkeler/hstcal.git), you would simply execute the following:

# Push all of your local branches up to your fork on Github (tags need a separate 'git push --tags origin')
$ git push --all origin

And magic... You're back in sync with the rest of the world.

Making a Release

Let's assume Megan merged your pull request, executed the HSTCAL regression test suite, and is now completely satisfied with the result. The code is part of the 'master' branch on the upstream (spacetelescope) repository, but how will the public obtain it? They could clone the repository as-is and build our software from scratch, sure, but there's a better way. If it isn't obvious by now, the new print statement we implemented is going to revolutionize science as we know it, so we had best start formulating a release.

Tagging

Imagine you are studying for a test. In order to mark important pages in your study guide ("Cooking Waffles in Orbit") you tack a post-it note with a short message to the top of each page, so whenever you need to refer back to certain information you simply reference the note and open the book to that location.

To make a public release we will use what are known as annotated tags.

Annotated Tags

In Git this is exactly how it looks (when compared to the book example above):

# Reading the pieces against the book example:
#   git tag                       "Cooking Waffles in Orbit"
#   -a 1.2.3                      The post-it note
#   -m 'Revolutionary changes'    The short message
#   1234abcd                      The page (if omitted, it marks the current page)
$ git tag -a 1.2.3 -m 'Revolutionary changes' 1234abcd

This is best read as a sentence:

"Tag the commit 1234abcd as version 1.2.3 and make sure everybody knows 'Revolutionary changes' happened."

In the case of HSTCAL, let's review the tags already associated with it:

$ git tag -n
1.0.0           hstdp-2015.3
#...

Megan agrees our improvement to the code warrants a bump of the software version from 1.0.0 to 1.1.0, because although it's pretty sweet, it still doesn't change the functionality enough to require a full bump to 2.0.0.

# [Megan] Makes sure she is on the 'master' branch
git checkout master
 
# [Megan] Creates a new annotated tag based on the latest commit
git tag -a 1.1.0 -m 'Best print statement ever implemented'
 
# [Megan] Pushes the tag upstream
git push --tags upstream master

Lightweight Tags

Git also supports what are known as lightweight tags. A lightweight tag is just a name that refers to a specific commit hash; it does not record a tagger, date, or message. These tags are best compared to dog-earing the corner of a page in a book. They are quick-and-dirty by definition, which is why they are not generally meant for public consumption. Tags such as these are designed to be applied locally, during development. Be careful, though: git push --tags pushes every tag in your repository, lightweight ones included. If you only want to publish annotated tags along with the commits you are pushing, use git push --follow-tags instead.

You are free to use them to help aid your coding, but please try not to confuse them with their annotated (proper) counterpart. 
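If you are ever unsure which kind of tag you are looking at, Git can tell you. A small sketch, assuming a scratch repository; the lightweight tag name "scratch" is a placeholder chosen for this demo.

```shell
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"
git config user.name "Your Name"
git config user.email you@example.com
git commit -q --allow-empty -m 'Initial commit'

# Annotated tag: a full tag object with tagger, date, and message
git tag -a 1.1.0 -m 'Best print statement ever implemented'

# Lightweight tag: only a name pointing straight at the commit
git tag scratch

# Ask Git what kind of object each tag name refers to
annotated_type="$(git cat-file -t 1.1.0)"      # "tag" for an annotated tag
lightweight_type="$(git cat-file -t scratch)"  # "commit" for a lightweight tag
echo "1.1.0 is a $annotated_type, scratch is a $lightweight_type"
```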

 

Conflict Resolution

(outline - also assumes you're working against an active PR on GitHub)

git checkout master
git pull --ff-only upstream master
# syncs upstream repo with local master
 
git push origin
# syncs origin (fork) with local master
 
git checkout busted-branch
git status
# Message indicates repos have diverged
 
git rebase origin/master
# Reports CONFLICT
 
git mergetool
# ---defaults---
# OSX: opendiff
# LINUX: undefined (suggest downloading tkdiff or using vimdiff -- pick your poison)
#
# NOTE:
# "mergetool" can be configured to always use a particular editor.
# If you configure mergetool in this way, you may simply run this:
#   git mergetool -y
 
# Current conflict resolved? Yes?
git rebase --continue
 
# No? Repeat mergetool/rebase-continue calls until all conflicts are resolved
 
git log
# Review your commits. Make sure you're satisfied, because if not, the next step will severely wreck your world.
 
git push -f origin
# Overwrites the existing busted-branch on your fork with resolved changes
 
# Check PR status on GitHub
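The rebase portion of this outline can be practiced without a remote. The sketch below assumes a scratch repository where two branches edit the same line of a made-up file (notes.txt); the conflict is resolved by overwriting the file by hand instead of launching mergetool, and GIT_EDITOR=true simply accepts the existing commit message.

```shell
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Your Name"
git config user.email you@example.com
echo 'original' > notes.txt
git add notes.txt
git commit -q -m 'Add notes'

# Two branches change the same line, which guarantees a conflict
git checkout -q -b busted-branch
echo 'branch version' > notes.txt
git commit -q -am 'Edit notes on branch'
git checkout -q master
echo 'master version' > notes.txt
git commit -q -am 'Edit notes on master'

# The rebase stops at the conflicting commit and exits non-zero
git checkout -q busted-branch
git rebase master >/dev/null 2>&1 || echo 'rebase stopped on a conflict'

# Resolve by hand (mergetool would help here), stage the file, and continue
echo 'branch version on top of master version' > notes.txt
git add notes.txt
GIT_EDITOR=true git rebase --continue >/dev/null

# The branch now sits one commit on top of master with the resolved contents
commits_on_top="$(git rev-list --count master..busted-branch)"
resolved_text="$(cat notes.txt)"
echo "$commits_on_top commit(s) on top of master: $resolved_text"
```

If a rebase goes sideways before you have continued it, git rebase --abort puts the branch back the way it was.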

 

 

 

2 Comments

  1. Hi Joe,

    It might be a good idea to document the standard form of HST and JWST tag names,  if such a thing exists,  i.e. a section on formal release naming.

    Todd

    1. Todd, I started working on that today. I'll try to keep adding more stuff this week.
