Source Code Control Concepts

Subversion, with the command line tool svn, is a revision control system, also known as a source code management system (SCM) or a source code control system (SCCS). The subversion server maintains a repository, where files are stored in a hierarchy of folders, the same as in a traditional file system. Files can be checked out and modified. The changes can be copied back into the repository, called committing the change, creating a new version of the file. The versioning maintains the history of each file, as well as the history of the entire repository including the arrangement of folders, so that you can recover any version or revisit changes by looking at differences between versions.

There are three reasons we want to use a source code management system:

  1. Where and how you want to work on your code might not be where and how you want your code stored. Code should be stored where it will have reliable backups, where it will be accessible from a variety of locations and platforms, and where it can reside for years.
  2. Source code in a source code management system allows for collaboration. Using a source code management system, the TA can sync to your source code by downloading it, compile it, and analyze the errors. The TA can put corrections into the code and comment back into the repository.
  3. A source code management system provides a history of all files and folders in the repository, and supports code branches and the re-merging of branches. Individual files can be recovered, or an entire change set can be rolled back at once to a previous repository version. In practice, this extensive history mechanism allows for confident cycles of code submission, and provides for the possibility of "code branches" or code forks.

Other popular SCCS include Git (provided as a service by GitHub), Perforce, and CVS.

Subversion versus Dropbox

Recently cloud file services have coalesced around a model of operation exemplified by Dropbox, so it might be helpful to contrast Subversion and Dropbox.

Subversion does a lot of what Dropbox does. It provides a way of sharing files, and of keeping files safe in a location separate from your local computer. However, Dropbox works continuously and thoughtlessly. With Subversion, you have to demand that files be pulled from or committed to the repository. The difference is a matter of requirements. When working with code, not every version of a file is valid. You might be making a large change across several files and you do not want the code shared until all the changes are tested.

Also, the commits made are all-or-nothing. If a commit requires changing multiple files, there will never be a moment in which some of the files have been updated and others have not. If subversion cannot make all changes, the entire change set is refused.

Everything is more deliberate in subversion than in Dropbox.

  1. Just because a file is in a folder in a subversion controlled file system does not mean that the file contents will be mirrored into the repository. After creating a file it must be added, which more precisely means, you must indicate that the file and its changes will be tracked. In Dropbox, you just drop a file into the Dropbox folder and it is assumed you wish it mirrored and its changes tracked.
  2. To delete a file it is not enough to run rm at the command line, nor to drag and drop the file into the waste bin. Use the command svn delete (or svn move, for a move). This will do two things: it will delete the local file, and it will schedule the deletion to be pushed to the repository at the next commit. If you just rm a file, subversion will helpfully restore the file with the next update.
  3. By the way, nothing is ever truly deleted from the repository. A deleted file can be restored by looking through the repository history, pulling out an old copy, and re-committing it. There is no equivalent of an "Empty Trash" action to purge thoroughly a deleted file. Once committed, the complete removal of a file is impossible by the client, and difficult even for the admin of the repository server.
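The three points above can be tried out safely against a throwaway repository. What follows is only a sketch: it assumes svn and svnadmin are installed, it uses a local scratch repository made with svnadmin create and a file:// URL in place of the class server, and the file name main.c is invented for the illustration.

```shell
# Scratch repository and working copy (all paths and names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"

# 1. A new file is untracked ("?") until it is added ("A").
echo "int main(void) { return 0; }" > "$tmp/wc/main.c"
before_add=$(svn status "$tmp/wc")
svn add -q "$tmp/wc/main.c"
after_add=$(svn status "$tmp/wc")
svn commit -q -m "Add main.c" "$tmp/wc"

# 2. A plain rm only makes the file "missing" ("!"); the next update restores it.
rm "$tmp/wc/main.c"
after_rm=$(svn status "$tmp/wc")
svn update -q "$tmp/wc"

# svn delete removes the file and schedules the deletion ("D") for the next commit.
svn delete -q "$tmp/wc/main.c"
after_delete=$(svn status "$tmp/wc")
svn commit -q -m "Delete main.c" "$tmp/wc"

# 3. The deleted file is still in the history, as it was in revision 1.
svn cat "file://$tmp/repo/main.c@1"

printf '%s\n' "$before_add" "$after_add" "$after_rm" "$after_delete"
```

The first column of each status line is the interesting part: ?, A, ! and D, in that order.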

Most importantly, Dropbox is not very clear about what it will do if collaborators on a file make conflicting changes to the file. This is a difficult issue, and Dropbox succeeds brilliantly for its target audience by a studied ignorance of the issue. It is not something the typical Dropbox user will worry about.

However, developers encounter these conflicts, and worry about them a lot. These editing conflicts are a major issue for source code control systems and they have complex and deliberate conflict resolution mechanisms.

Pull–Modify–Commit

The basic work cycle of subversion is Pull–Modify–Commit:

It is possible that while one programmer was modifying one working copy, another programmer has modified and committed changes in another working copy. It is possible that these two modifications conflict. If the change conflicts, the commit will not succeed. Not only will the conflicting file not be pushed, all changes in the commit are aborted.

If you have a conflict, there are several resolution paths:

Version Numbers (the theory)

A repository has a version number. Beginning with an empty repository with version number 0, each commit increases the version number by one. Repository version N is therefore the state of the repository after the N-th commit. A file or directory is said to be version N when it is as it appears in version N of the repository.

This can be counterintuitive. The common notion of the "versions" of a file would be the successive changes in the file. In subversion, most "versions" of the same file are the same, as they are carried forward unchanged to the next repository version. The revision in which a file last changed is given by the revision specifier "COMMITTED". The subversion status command when used with the -v option gives both the revision of each file and its COMMITTED revision.
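To see the difference between a file's revision and its COMMITTED revision, here is a small sketch against a scratch repository. It assumes svn and svnadmin are installed; the local file:// repository and the names a.txt and b.txt are invented, and the --show-item option of svn info assumes Subversion 1.9 or later.

```shell
# Scratch repository and working copy (all paths and names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"

# Revision 1 adds two files; revision 2 changes only b.txt.
echo one > "$tmp/wc/a.txt"
echo one > "$tmp/wc/b.txt"
svn add -q "$tmp/wc/a.txt" "$tmp/wc/b.txt"
svn commit -q -m "Add a.txt and b.txt" "$tmp/wc"
echo two-two >> "$tmp/wc/b.txt"
svn commit -q -m "Change b.txt" "$tmp/wc"
svn update -q "$tmp/wc"

# Both files are now at revision 2, but a.txt last changed in revision 1.
# In the -v listing the two numeric columns are the revision and the
# last committed revision.
svn status -v "$tmp/wc"
a_committed=$(svn info --show-item last-changed-revision "$tmp/wc/a.txt")
b_committed=$(svn info --show-item last-changed-revision "$tmp/wc/b.txt")
echo "a.txt last changed in r$a_committed, b.txt in r$b_committed"
```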

When a repository is pulled, a working copy is made of all the files and directories. The revision of the repository at the moment of copy is called the base revision. Copies of the files as they appear in the base revision are retained by the local subversion client in order to either revert the working copy back to the base, or to calculate the differences between the working copy and the base. In a commit, these differences are transmitted to the repository.

A modified file is a working copy of a file that is different from the base copy. This happens because the programmer has modified the working copy. A file has an out-of-date base if the base revision number is less than the current committed revision number of that file in the repository. This happens because some programmer has modified the file since the working copy was created, and has committed those modifications. Note that determining whether a base is out-of-date requires communication with the repository to query the current committed revision for the file.

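A file is in one of four states, according to whether its working copy is modified or unmodified, and whether its base is up-to-date or out-of-date.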

Commits and updates take actions depending on the state, as given by this table:

    Working      Base          Commit        Update
    unmodified   up-to-date    no action     no action
    modified     up-to-date    push          no action
    unmodified   out-of-date   no action     pull
    modified     out-of-date   not allowed   resolve conflict

It is a subversion principle that pulls and commits are independent.
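
If it helps to see the table as a lookup, it can be written as a small shell function. This is only a restatement of the table, not anything subversion itself provides; the function name file_action and its argument order are made up for the illustration.

```shell
# file_action WORKING BASE OPERATION
# Prints what the operation does to a file in the given state, per the table.
# (Hypothetical helper; the name and argument order are invented here.)
file_action() {
    case "$1/$2/$3" in
        unmodified/up-to-date/commit)   echo "no action" ;;
        unmodified/up-to-date/update)   echo "no action" ;;
        modified/up-to-date/commit)     echo "push" ;;
        modified/up-to-date/update)     echo "no action" ;;
        unmodified/out-of-date/commit)  echo "no action" ;;
        unmodified/out-of-date/update)  echo "pull" ;;
        modified/out-of-date/commit)    echo "not allowed" ;;
        modified/out-of-date/update)    echo "resolve conflict" ;;
        *) echo "unknown state or operation" >&2; return 1 ;;
    esac
}

file_action modified out-of-date commit
file_action modified out-of-date update
```

The last two lines show the interesting row: a commit of a modified, out-of-date file is refused, and it is the update that does the resolving.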

Commits are all-or-nothing. If any file would cause a conflict, nothing is pushed. Assuming the commit occurs, the repository revision number is advanced and the local base revision for all pushed files becomes that revision number. The base revision number for all other files remains unchanged.

For directories, their base is never changed by a commit, even if the directory has been modified (a file has been added or removed). In this case, the revision information for a directory will be erroneous.
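The lagging directory revision is easy to observe. A sketch, assuming svn and svnadmin are installed and Subversion 1.9 or later for svn info --show-item; the scratch repository and the file name hello.txt are invented.

```shell
# Scratch repository and working copy (all paths and names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"

echo "hello" > "$tmp/wc/hello.txt"
svn add -q "$tmp/wc/hello.txt"
svn commit -q -m "Add hello.txt" "$tmp/wc"

# The commit advanced the base of the pushed file only; the directory
# keeps the base revision it was checked out at.
file_rev=$(svn info --show-item revision "$tmp/wc/hello.txt")
dir_rev=$(svn info --show-item revision "$tmp/wc")
echo "after commit: file at r$file_rev, directory at r$dir_rev"

# An update brings every base revision to the repository's current revision.
svn update -q "$tmp/wc"
dir_rev_updated=$(svn info --show-item revision "$tmp/wc")
echo "after update: directory at r$dir_rev_updated"
```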

Update will bring all files in the subtree where it is applied to the current revision number of the repository. Subversion will pull any files which have a newer commit revision than the local base version; and directories will be updated to reflect added, deleted and moved files and directories. After the update, all the base revisions are set to the current revision in the repository.

Files that are in conflict during the update will need to be resolved. Resolving the conflict will involve editing the working copy so that the contents reflect both sets of changes: those made on the local platform, and those committed independently to the repository. Resolving a conflict will reset the base to the repository's committed version, and the working version will be the merged and edited file, ready to be committed.

If a commit is refused because of conflicts, the update operation is used to resolve those conflicts. The update operation can resolve some conflicts automatically, by merging the local modifications into the repository's current version of the file. It is also possible to revert the local copy, update, and then the programmer can reapply the modifications to the now-current working copy.
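Here is a sketch of a refused commit followed by an automatic merge, using two working copies of one scratch repository. It assumes svn and svnadmin are installed; the programmers alice and bob and the file notes.txt are invented, and the two edits are deliberately placed far apart in the file so that they can be merged without help.

```shell
# One scratch repository, two working copies (all names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/alice"
printf '%s\n' one two three four five six seven > "$tmp/alice/notes.txt"
svn add -q "$tmp/alice/notes.txt"
svn commit -q -m "Add notes.txt" "$tmp/alice"
svn checkout -q "file://$tmp/repo" "$tmp/bob"

# Alice rewrites the first line and commits; Bob rewrites the last line.
printf '%s\n' "ONE (alice)" two three four five six seven > "$tmp/alice/notes.txt"
svn commit -q -m "Alice edits line 1" "$tmp/alice"
printf '%s\n' one two three four five six "SEVEN (bob)" > "$tmp/bob/notes.txt"

# Bob's base is now out-of-date, so his commit is refused.
if svn commit -q -m "Bob edits line 7" "$tmp/bob" 2>/dev/null; then
    first_try=accepted
else
    first_try=refused
fi
echo "first commit attempt: $first_try"

# The update merges Alice's change into Bob's modified file; then the commit goes through.
svn update -q --non-interactive "$tmp/bob"
svn commit -q -m "Bob edits line 7" "$tmp/bob"
merged=$(svn cat "file://$tmp/repo/notes.txt")
echo "$merged"
```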

Subversion Activities

At the command line, subversion is controlled through the program "svn". The svn program has subcommands like checkout, add, commit, that are invoked as the second word of the command: e.g. svn checkout, svn add, and svn commit. For installing subversion, see below. Subversion is installed on a Mac that has Xcode installed.

The Pull

An initial pull into an empty directory creates a working copy of the repository on your local machine. The directory is created if it does not exist. The name of the directory is arbitrary — your choice.

To pull you will need: the repository hostname; the path to the specific repository on the named host; a username; a password; and the protocol that will be used to pull.

CSC421.171 specific information:

Students in CSC421, Fall Semester of 2016-17, will receive their username and password by email. The hostname is svn.cs.miami.edu, the path is classes/csc421.171, and the protocol is svn. These last are compounded into the repository access URL svn://svn.cs.miami.edu/classes/csc421.171. An example pull, alias checkout, alias co, would be:

    svn co --username pikachu --password svengali \
        svn://svn.cs.miami.edu/classes/csc421.171 mycsc421stuff.svn

The result will be a file tree descending from newly created directory mycsc421stuff.svn.

The svn command takes the arguments --username and --password. Each use of the svn command saves the username and password so that subsequent svn invocations need not use these arguments if the previous values are correct. How these values are saved is platform specific, and subject to change.

After a working copy has been pulled, changes in the repository source can be integrated into the working copy by a pull called an "update". The update works from the current directory and all descendant directories and files, and will replace files unmodified in the working copy with newer versions, or will attempt to merge changes into files which have been modified since the last pull. Merges can succeed silently, or if there are conflicts that cannot be silently resolved, the user will be informed that the merge must be handled manually.

Updates are done by the command svn update. The repository URL, etc, are inferred from your current directory. You can focus an update on a sub-tree of the entire check-out by moving the current directory to the root of the sub-tree before issuing the svn update command.
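
A sketch of an update focused on a sub-tree, assuming svn and svnadmin are installed. The scratch repository stands in for the class server, and the directory homework and the file names are invented; only the sub-tree the update is run from receives the newer revision.

```shell
# One scratch repository, two working copies (all names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc1"
mkdir "$tmp/wc1/class" "$tmp/wc1/homework"
echo v1 > "$tmp/wc1/class/syllabus.txt"
echo v1 > "$tmp/wc1/homework/hw1.txt"
svn add -q "$tmp/wc1/class" "$tmp/wc1/homework"
svn commit -q -m "Initial tree" "$tmp/wc1"
svn checkout -q "file://$tmp/repo" "$tmp/wc2"

# A second commit from wc1 changes a file in each directory.
echo v2-longer > "$tmp/wc1/class/syllabus.txt"
echo v2-longer > "$tmp/wc1/homework/hw1.txt"
svn commit -q -m "Revise both files" "$tmp/wc1"

# Updating from inside class/ touches only that sub-tree of wc2.
(cd "$tmp/wc2/class" && svn update -q)
class_copy=$(cat "$tmp/wc2/class/syllabus.txt")
homework_copy=$(cat "$tmp/wc2/homework/hw1.txt")
echo "class: $class_copy, homework: $homework_copy"
```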

For CSC421, update the class subdirectory in your working copy often, to take my changes. Since students do not have commit permissions on this directory, any conflicts should be remedied by replacing the local copy with the repository version. If necessary, revert the local class subdirectory, svn revert, to explicitly reverse any subversion significant changes you have made in that directory.

The Commit

Changes are pushed into the repository, and made a permanent part of the repository with a new version number, using the commit action: svn commit.

Committing new files and folders

The commit will push all tracked (added) and modified files that are descendants of the current directory. Files checked out from the repository are tracked. A newly created file or folder is not tracked until the svn add command is run naming that file or folder. Typical creation of a file is therefore a two step process:

  1. Use mkdir or touch to create the folder or file, respectively.
  2. Use svn add to start tracking the folder or file.

The antidote to add is revert. If you do not want to track a file, you cannot simply remove it. Subversion has it in its head that that file is supposed to exist. If you wish to abandon the creation of a file or folder, and you have not yet committed or succeeded in trying to commit the file or folder, run svn revert _filename_ on the file or folder. This will cancel the add. Then you can "rm" the file or folder locally, if you wish.
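
The add, revert, rm sequence looks like this in practice. A sketch, assuming svn and svnadmin are installed; the scratch repository and the file name scratch.c are invented.

```shell
# Scratch repository and working copy (all paths and names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"

touch "$tmp/wc/scratch.c"
svn add -q "$tmp/wc/scratch.c"
status_added=$(svn status "$tmp/wc")      # "A": scheduled for addition

svn revert -q "$tmp/wc/scratch.c"
status_reverted=$(svn status "$tmp/wc")   # "?": untracked again, file still on disk

rm "$tmp/wc/scratch.c"
status_removed=$(svn status "$tmp/wc")    # empty: nothing left to report

printf '[%s]\n' "$status_added" "$status_reverted" "$status_removed"
```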

Committing changes to existing files and folders

In the simple case, svn commit will push the new version into the repository, and the version number of the repository will be incremented; and your local version will be marked as current.

However, if a file or folder modified in the working copy has been modified in the repository since the pull of that file or folder to the working copy, the commit will fail. The file is said to be in conflict and conflicted files and folders cannot be committed. The conflict must first be resolved.

Resolve conflicts by running svn update to pull the repository version into the working copy, integrating the changes in the repository file into the working copy. Svn update will integrate some changes automatically by merging the files. These are mergeable conflicts. If all conflicts are mergeable, the merged working copy is considered a modification of the current repository version, and can be committed.

If unmergeable conflicts remain after the update, the conflicts must be resolved by hand, and then svn resolve is run to mark the conflict as resolved, and clean up certain temporary files created by subversion.

Subversion will offer to resolve the unmergeable conflicts interactively during svn update, and you can choose to postpone the resolution. If you postpone the resolution, subversion will mark the file as conflicted, and will leave three additional files in the directory, as well as possibly modifying your working version according to its "advice" on the unmergeable conflicts. You can then edit or copy or move these files. Once finished, run the svn resolve command to clean up the temporary files and to remove the conflict marker. Then run svn commit again.

Svn resolve has the option --accept which takes several values. If you have edited your working version, and consider it definitive, use:

    svn resolve --accept working _filename_

then commit. The other safe possibility is to revert your working file to its unmodified base, then update to get the newer version, then reapply your changes, and then commit.
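
The postpone, edit, resolve, commit path can be rehearsed end to end. This is a sketch under assumptions: svn and svnadmin are installed, the repository is a local scratch one, and the programmers alice and bob and the file config.txt are invented. Both edit the same line, so the conflict cannot be merged automatically.

```shell
# One scratch repository, two working copies (all names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/alice"
echo "greeting = hello" > "$tmp/alice/config.txt"
svn add -q "$tmp/alice/config.txt"
svn commit -q -m "Add config.txt" "$tmp/alice"
svn checkout -q "file://$tmp/repo" "$tmp/bob"

# Both edit the same line; Alice commits first.
echo "greeting = hello, world" > "$tmp/alice/config.txt"
svn commit -q -m "Alice's greeting" "$tmp/alice"
echo "greeting = good morning" > "$tmp/bob/config.txt"

# Bob updates and postpones the conflict instead of answering prompts.
svn update -q --accept postpone "$tmp/bob"
conflict_status=$(svn status "$tmp/bob")   # "C" marks config.txt
ls "$tmp/bob"                              # config.txt plus the .mine, .r1 and .r2 copies

# Bob edits the working file by hand, declares it definitive, and commits.
echo "greeting = good morning, world" > "$tmp/bob/config.txt"
svn resolve -q --accept working "$tmp/bob/config.txt"
svn commit -q -m "Merge both greetings" "$tmp/bob"
final=$(svn cat "file://$tmp/repo/config.txt")
echo "$final"
```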

Working with Subversion'ed Files

The svn status command should be run to keep you sane about what is going on.

The verbose mode, svn status -v, will provide you a very concise vision of what is being tracked, what tracked files are modified and will be pushed on commit, what newly tracked files will be added to the repo on the next commit, and what removed files will be removed from the repo (but they are not really removed — they remain in the repo history).

The verbose switch tells you what the working copy knows. Use the -u switch to contact the repo to see what has been modified in the repo, and might require an update before a commit.
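
A sketch contrasting the two switches, assuming svn and svnadmin are installed. The scratch repository, the two working copies mine and theirs, and the file report.txt are all invented for the illustration.

```shell
# One scratch repository, two working copies (all names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/mine"
echo "draft" > "$tmp/mine/report.txt"
svn add -q "$tmp/mine/report.txt"
svn commit -q -m "Add report.txt" "$tmp/mine"
svn checkout -q "file://$tmp/repo" "$tmp/theirs"

# Someone else commits a change; I edit my copy without updating.
echo "draft, reviewed by a colleague" > "$tmp/theirs/report.txt"
svn commit -q -m "Review report.txt" "$tmp/theirs"
echo "draft with my own edits" > "$tmp/mine/report.txt"

local_view=$(svn status -v "$tmp/mine")   # "M": the working copy knows only my edit
repo_view=$(svn status -u "$tmp/mine")    # "M" plus "*": the repo has a newer version
printf '%s\n\n%s\n' "$local_view" "$repo_view"
```

A line carrying both M and * is the one to watch for: that file will need an update before it can be committed.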

Deleting, copying and renaming files will seem cumbersome. Even though you can do what you want on your machine, source code control systems really cannot follow these sorts of modifications automatically, and will need guidance. Often subversion seems to be hostile and petulant when trying to accomplish deletes and renamings.

The command svn revert will replace your working copy with the pulled version on which the working copy is based. As long as you make a local copy of whatever will be reverted, this is often a good way to straighten out misunderstandings between you and the repo.

The command svn delete is the proper way to delete a tracked file. It will both do the local rm and note that you wish the file removed, and it will inform the repo of this on the next commit. Note that the repository continues to have a copy of a deleted file. A deleted file can be recovered. Generally you will need to know the name of the file so you can use svn log to look for the last repo version that contained the file.
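
One way to recover a deleted file is sketched below. It assumes svn and svnadmin are installed; the scratch repository and the file name lost.txt are invented. It uses svn copy with a revision pinned by @1 to bring the old file back; fetching the old contents with svn cat and adding them again would also work, but would not carry the file's history along.

```shell
# Scratch repository and working copy (all paths and names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
echo "precious" > "$tmp/wc/lost.txt"
svn add -q "$tmp/wc/lost.txt"
svn commit -q -m "Add lost.txt" "$tmp/wc"       # revision 1
svn delete -q "$tmp/wc/lost.txt"
svn commit -q -m "Delete lost.txt" "$tmp/wc"    # revision 2
svn update -q "$tmp/wc"

# The verbose log lists changed paths; "D /lost.txt" marks the deleting revision.
history=$(svn log -v "file://$tmp/repo")
echo "$history"

# Copy the file as it was in revision 1 back into the working copy, then commit.
svn copy -q "file://$tmp/repo/lost.txt@1" "$tmp/wc/lost.txt"
svn commit -q -m "Restore lost.txt" "$tmp/wc"
restored=$(svn cat "file://$tmp/repo/lost.txt")
echo "$restored"
```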

To rename, use svn move. This is a combination of svn copy and svn delete so that on the next commit subversion understands your intent and will make the proper tree adjustments.
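
A rename, sketched against a scratch repository. It assumes svn and svnadmin are installed; the file names old_name.c and new_name.c are invented.

```shell
# Scratch repository and working copy (all paths and names are illustrative).
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
echo "int x;" > "$tmp/wc/old_name.c"
svn add -q "$tmp/wc/old_name.c"
svn commit -q -m "Add old_name.c" "$tmp/wc"
svn update -q "$tmp/wc"

# The move shows up as a scheduled delete plus an add with history ("A  +").
svn move -q "$tmp/wc/old_name.c" "$tmp/wc/new_name.c"
move_status=$(svn status "$tmp/wc")
echo "$move_status"

# Commit from the top of the working copy so both halves go in together.
svn commit -q -m "Rename old_name.c to new_name.c" "$tmp/wc"
listing=$(svn list "file://$tmp/repo")
echo "$listing"
```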

If there is a problem during the commit of tree modifications, the resolution is a bit trickier than with file conflicts. Generally, a change to the tree structure should be noted in the commit comment, and you might need to abandon your changes, get the new tree structure right, then reapply your changes and commit.

Installing subversion

You can access your subversion repository from the internet using the svn URL and subversion client software. You will need to have or install the subversion client software, the command "svn". OSX ships with svn native. If your Ubuntu (or other Debian derived) image does not have svn, install it with sudo apt-get install subversion. Periodically, apt-get needs its indexing of installables refreshed. Use sudo apt-get update.
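
A quick way to check whether the client is already there, before installing anything. This is a sketch; the variable name svn_report is invented, and the suggested install command is simply the one given above for Ubuntu.

```shell
# Report whether the svn client is on the PATH, and which version it is.
if command -v svn >/dev/null 2>&1; then
    svn_report="svn $(svn --version --quiet) found at $(command -v svn)"
else
    svn_report="svn not found; on Debian or Ubuntu try: sudo apt-get update && sudo apt-get install subversion"
fi
echo "$svn_report"
```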

For Windows, I suggest installing Cygwin, a Unix-like subsystem for Windows. You can install svn as part of Cygwin.

History of Subversion:

The first source control system was SCCS, written in 1972 by Marc J. Rochkind at Bell Labs. It was ported to Unix and is still part of Unix today. Its successor, RCS, was written for Unix in 1982 by Walter Tichy. That was supplanted by CVS, written in 1986 by Dick Grune. CVS was used by many important open source projects such as FreeBSD. (The FreeBSD source repository switched from CVS to Subversion on May 31st, 2008.)

Subversion is a product of CollabNet, which was founded by Tim O'Reilly and Brian Behlendorf in 1999. CollabNet started the Subversion project in 2000, and since 2010 it has also been known as Apache Subversion.

You may be wondering about the illustration that opens this page. You might ask: Why Babar the Elephant? That's simple: Because an elephant never forgets. What other explanation for that choice could there be?

You can stop reading now, there is really nothing interesting following.

————

No, it's not because elephants don't forget. Please, don't you see?

Subversion is really the instrument of a vast digital illuminati only posing as a software control system. Once you understand this, it all becomes clear: subversion's useless error messages, its seemingly careless shortcomings, its completely aberrant command syntax — all a deliberate mask of mystery over oddly shaped concepts that you recognize as keys once you see the locks.

Why else would it be named Subversion?

Look the word up in the dictionary, if you don't believe me.
