Introduction

This tutorial contains all the steps you need to take in order to start this course. Keep in mind that starting to program is similar to learning a new language. There will be times of frustration when you don't understand what you are saying or doing wrong, but perseverance is key to success when learning how to program. Another piece of advice for your programming journey is to focus on the structure, logic and concepts of programming.

This tutorial is structured in two parts:

  • The first part is the simple version, which is meant to get you set up in 15 minutes.
  • The second part is for those interested in a more comprehensive and in-depth tutorial (this extended version assumes that the basic installations have already been made).

If you are new to programming, we suggest you keep reading from here (the simple version). If you already have the basic programming setup, jump to the second part, named Extended Version. We begin by introducing the elements you need, and then get everything set up, with an explanation of why you need each one. Python, Jupyter Notebook and GitHub are names that will be familiar to you after this tutorial.

Introduction to the Command Line

Terminal

The command line is an essential user interface which is navigated by text commands typed at a prompt rather than with a mouse. It lets you carry out operations that often cannot be accessed through your normal GUI (Graphical User Interface). On Windows machines it is known as the Command Prompt, and on macOS and Linux operating systems it is known as the Terminal. When beginning to code it is essential to understand some basic commands in order to use the command line. Below is a table of essential commands with a brief explanation of how to use them.

Command               | macOS | Windows | Linux
List Files            | ls    | dir     | ls
Find Current Location | pwd   | chdir   | pwd
Change Directory      | cd    | cd      | cd

  • Listing Files: When the ls command (dir on Windows) is entered, the command line lists all the items within the current directory. When you first open the command line, this will usually be the contents of your home folder, as seen in the figure above.
  • Finding Your Current Location: This command returns exactly which directory you are in, and the path taken to get there. For example, if the command line returns /Users/Jordan/Desktop/Papers/PQA, this means you are currently in the folder labeled "PQA", located in the folder "Papers", which is on the desktop of the computer's user "Jordan."
  • Changing Directories: Now that you know how to find where you are located, you can begin to navigate in the command line. On macOS and Linux, entering cd alone returns you to your home directory (on Windows, cd alone prints the current directory instead). Followed by a directory name, cd moves you to the specified directory. E.g. cd Desktop will take you to the desktop, and cd Desktop/Papers will take you through the desktop to the folder "Papers."
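The same three operations can also be tried from Python itself, which may help connect the commands to what they do. This is only a sketch using Python's standard os and pathlib modules; moving to the parent folder is an arbitrary choice made here because that folder exists on every system.

```python
import os
from pathlib import Path

# Equivalent of pwd / chdir: where am I right now?
start = Path.cwd()
print("Current location:", start)

# Equivalent of ls / dir: list the items in the current directory.
for name in sorted(os.listdir(start)):
    print(name)

# Equivalent of cd <folder>: move to another directory.
# The parent folder is used only because it is guaranteed to exist.
os.chdir(start.parent)
print("Now in:", Path.cwd())

# Move back to where we started.
os.chdir(start)
print("Back in:", Path.cwd())
```

The printed paths and file names depend on the folder in which the code is run.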

For more commands, follow these links for Linux, Windows, and macOS.

Introduction to Python

Python is a programming language that was designed to be readable and easily understood. To better understand what a programming language is, imagine that computers are people: in order to get people to do something, you have to communicate with them. Programming languages are generally much simpler than human languages, but they are different, which is why many people struggle with them at first. Just imagine how you would feel if you had to live in a village in India where only Hindi was spoken; that would be much more difficult. Many beginners find coding with Python highly satisfying, as they are able to construct prototypes and tools quickly and with ease. The benefits don't stop there: Python is also free and has a large community to help you if you get stuck. It is arguably the most beginner-friendly language, which is why we recommend it in this tutorial to get you started with coding. So let's get started!

Installing Python with Anaconda

Open your browser and follow this link to the Anaconda download page, and download the version recommended for your system (Linux, macOS, Windows). Run the downloaded installer; once it has finished, Anaconda is ready to use. NB: Always choose the latest version, which is the one with the highest number.

Jupyter Notebook

After installing Python, you will need an Integrated Development Environment (IDE) to begin coding. Try to imagine an IDE as Word or Pages: these help you build beautiful documents by letting you write text in your language, insert illustrations and format everything into something you can use. If you didn't have Word or Pages, how would you make your CV, assignment, or application? An IDE lets you write in a programming language and turns what you write into beautiful applications that you can use. Similar to Word and Pages, IDEs support you in this process with functions like the debugging tool, which helps you correct errors just like Word helps you with spelling and sentence structure errors. They also highlight elements of code (e.g. variables, strings, numbers) for better understanding, and format code automatically, which makes sure parentheses are closed and lines are indented. Now that you understand the meaning and importance behind the abstract IDE acronym, let's continue towards getting fully set up for your coding journey.

In our opinion, the best IDE to get started with programming is Jupyter Notebook. Jupyter is included in the Anaconda Distribution, so if you have followed our instructions to the letter, chances are that you already have Jupyter Notebook on your computer. If not, then start following the instructions to the letter. I hear some of you complain "but I'm not gonna use Python, I'll use JavaScript". To that my response is: firstly, don't, and secondly, you have to have Python installed on your computer to use Jupyter Notebook anyway, so why make life difficult? Just install the Anaconda Distribution and you're set.

How to launch Python with Jupyter Notebook

Once you have Python and an IDE (Jupyter Notebook) installed, you are ready to begin coding! To do so, you need to launch Python and the IDE first. Luckily, you are using Jupyter, which is capable of running the code within the IDE itself. If your IDE were not capable of this, you would need to run your code on the command line or terminal. To launch Jupyter, do the following.

Launching Jupyter Notebooks

  1. Open your Terminal/Command Line
  2. Launch Jupyter Notebook with the command jupyter notebook; Jupyter Notebook's UI will then open in your browser
  3. Open a new file in Jupyter's UI and begin coding!

Once you write your first lines of code in the IDE, and feel ready to try out your program, you can run it in your terminal if you are using an IDE other than Jupyter Notebook, or run it in Jupyter Notebook itself as follows.

Running Your Code

  1. Select the cell with your code
  2. Next, either press the run button or use the shortcut shift + enter. Python runs the code in the cell and shows any output directly below it.

Now that you have everything set up, you are ready to start experimenting and building stuff. Python will give you the tools necessary to build various applications, while Markdown will help you format the text that you want to show alongside your code.

Introduction to Markdown

Markdown serves as the main text formatting language for Jupyter Notebooks. Markdown is sensitive to spacing: for example, a heading is written as # followed by a space, so # Title is rendered as a heading while #Title may not be. Case and spacing also matter in the Python code you write alongside it: Python treats mycode and myCode as two different variables, and my code (with a space) is not a valid variable name at all. Markdown is designed to be easily converted to HTML. The following is a list of 15 useful Markdown commands and how they appear when the Markdown code is run.

Some Markdown commands
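The point about case and spacing can be checked in a Python code cell. The sketch below uses the two names from the text; the compile call is only there to show, without stopping the program, that a name containing a space is rejected.

```python
# Two names that differ only in case are two separate variables in Python.
mycode = "lowercase"
myCode = "camelCase"

print(mycode)            # the first variable
print(myCode)            # a different variable
print(mycode == myCode)  # compares the two values

# A name containing a space is not accepted at all.
try:
    compile("my code = 1", "<example>", "exec")
    space_allowed = True
except SyntaxError:
    space_allowed = False

print("Name with a space allowed:", space_allowed)
```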

What are Git and GitHub?

Git is a distributed version control system which tracks changes made to the files in a project. In layman's terms, it is a system which allows you to track all the changes made to a project. This is useful because it makes it easier to collaborate on projects, since the system tracks the changes made by you and others. The online platform we will use to access this system is called GitHub.

To use Git with GitHub, users typically "clone" a copy of the online repository from GitHub onto their own hard drive and work on the files independently. After finalising their changes to the code, they then upload (push) their edited version back online. Git is primarily used for source-code management in software development, but it can be used to keep track of changes in any set of files.

Key Terms Used

Key Term | Explanation
Version Control System | A system that tracks changes in files over time and maintains a library of all past versions of those files. These previous versions may be recalled at a later time. A more detailed explanation is provided in Chapter 4.2.
Repository | A folder containing all tracked files as well as the version control history. It can be stored in a local folder on your computer or on an online platform (i.e. a remote repository). GitHub is an example of a platform hosting remote repositories.
Snapshot | The recorded state of the tracked files at one point in time. Changes made while developing a program become part of a snapshot once they are committed.
Commit | A snapshot of the changes made to the staged files.
Stage | The staging area holds the files to be included in the next commit.
Track | A tracked file is one that is recognized by the Git repository from previous snapshots.
Branching | Having multiple versions of the code simultaneously in a repository, where each branch has its own commit history and current version.
Local | The version of a repository that is stored on your personal computer.
Remote | The version of a repository that is stored on a remote (i.e. online) server.
Clone | Create a local copy of a remote repository on your personal computer.
Fork | Make a copy of another user’s repository on GitHub to your own account.
Merge | To update files by incorporating the changes introduced in new commits.
Pull | To retrieve commits from a remote repository and merge them into a local repository.
Push | To send commits from a local repository to a remote repository.
Pull request | A request sent by one GitHub user asking to merge the commits in their remote repository into another user’s remote repository.
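To make the terms stage, commit and snapshot more concrete, here is a toy model in Python. It is a deliberately simplified sketch and not how Git is implemented; the class ToyRepository and the file names notes.txt and todo.txt are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    """A toy model of staging and committing; not Git's real implementation."""

    working_files: dict = field(default_factory=dict)  # file name -> content
    staged: dict = field(default_factory=dict)         # files for the next commit
    commits: list = field(default_factory=list)        # history of snapshots

    def add(self, name):
        # Stage: copy the current content of one file into the staging area.
        self.staged[name] = self.working_files[name]

    def commit(self, message):
        # Commit: the new snapshot is the previous one plus everything staged.
        previous = self.commits[-1]["snapshot"] if self.commits else {}
        snapshot = {**previous, **self.staged}
        self.commits.append({"message": message, "snapshot": snapshot})
        self.staged = {}


repo = ToyRepository()
repo.working_files["notes.txt"] = "first draft"
repo.working_files["todo.txt"] = "buy milk"

repo.add("notes.txt")      # only notes.txt is staged
repo.commit("Add notes")   # todo.txt was never staged, so it stays untracked

repo.working_files["notes.txt"] = "second draft"
repo.add("notes.txt")
repo.commit("Revise notes")

for entry in repo.commits:
    print(entry["message"], "->", entry["snapshot"])
```

Each commit keeps its own snapshot, so the first version of notes.txt is still available after the second commit.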

Step-by-Step Github

Do you have a Github account?

  • YES: then sign in with your login details.
  • NO: then sign up by creating a password and entering your email address.

Github

Once you have downloaded the desktop version of GitHub, you have a choice between using the desktop interface to work collaboratively on projects and using the terminal directly. Below you will find an introduction to both options.

  • I want to use GitHub Desktop to access GitHub. Then click on GitHub Desktop and watch a video that will explain it much better than we ever could in text format.
  • I want to use the terminal/command line to access GitHub. The command line is the only place where you can run all Git commands, so you should know how to open the Terminal on macOS or Linux, or the Command Prompt on Windows. The following table lists the most commonly used commands:

Command | Explanation
git init | Initializes a new Git repository and begins tracking an existing directory. It adds a hidden subfolder within the existing directory that houses the internal data structure required for version control.
git clone | Creates a local copy of a project that already exists remotely. The clone includes all the project’s files, history, and branches.
git add | Stages a change. Git tracks changes to a developer’s codebase, but it’s necessary to stage and take a snapshot of the changes to include them in the project’s history. This command performs staging, the first part of that two-step process. Any changes that are staged will become a part of the next snapshot and a part of the project’s history. Staging and committing separately gives developers complete control over the history of their project without changing how they code and work.
git commit | Saves the snapshot to the project history and completes the change-tracking process. Anything that’s been staged with git add will become a part of the snapshot with git commit.
git status | Shows the status of changes as untracked, modified, or staged.
git branch | Shows the branches being worked on locally.
git checkout | Followed by the name of a branch, switches you to that branch.
git merge | Merges lines of development together. This command is typically used to combine changes made on two distinct branches.
git pull | Updates the local line of development with updates from its remote counterpart. Developers use this command if a teammate has made commits to a branch on a remote, and they would like to reflect those changes in their local environment.
git push | Updates the remote repository with any commits made locally to a branch.
git log | Shows the commit history.
git help | Shows help for Git and its commands.

And much more.
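As a rough illustration of how these commands follow one another for a first commit, the sketch below lists them in order and offers a helper to run them from Python. The function names first_commit_commands and run_in are made up for this example, and nothing is executed unless you call run_in yourself.

```python
import shlex
import shutil
import subprocess


def first_commit_commands(message):
    """Return the Git commands for a first commit, in the order they are run."""
    return [
        ["git", "init"],                   # create the repository
        ["git", "add", "."],               # stage everything in the folder
        ["git", "commit", "-m", message],  # save the staged snapshot
        ["git", "status"],                 # check that nothing is left over
        ["git", "log"],                    # view the commit history
    ]


def run_in(folder, commands):
    """Run each command inside folder, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=folder, check=True)


commands = first_commit_commands("First commit")
for command in commands:
    print(shlex.join(command))

if shutil.which("git") is None:
    print("Git was not found on this machine.")
```

To actually run the commands, one would call run_in with the path of a project folder of one's own, after Git has been installed and configured with a user name and email.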

Other Resources

When you begin coding you will inevitably run into challenges. Thankfully, there are many online communities and platforms where coders come together to help each other. Some of these platforms include Stack Overflow and subreddits like r/programming and r/Python on Reddit. Additionally, there are several free online resources to help you enhance your coding skills, such as MIT OpenCourseWare, SoloLearn, and Codecademy.


Extended Version:

Introduction to Python

What is Python?

Python is a general-purpose programming language, which means that it can be used for nearly everything. Unlike compiled languages such as C++, Python is an interpreted language, which means that the written code is not translated to a computer-readable format ahead of time; instead, the interpreter translates and executes it at runtime. This type of language is also referred to as a "scripting language" because it was initially meant for developing simple projects.
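What happens at runtime can be glimpsed with two built-in functions. This is a minimal sketch, assuming the standard CPython interpreter: compile turns source text into an internal code object, and exec runs it.

```python
source = "result = 6 * 7"

# Step 1: the interpreter translates the source text into a code object.
code = compile(source, "<example>", "exec")
print(type(code).__name__)

# Step 2: the interpreter executes that code object.
namespace = {}
exec(code, namespace)
print(namespace["result"])
```

Normally both steps happen automatically whenever a Python file or notebook cell is run.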

Python is also an object-oriented, high-level programming language with dynamic semantics, which makes it highly attractive for Rapid Application Development, as well as for use as a tool to connect existing components together. Python can also be used to process text, display numbers or images, solve scientific equations, and save data. In essence, it is often used behind the scenes to process many elements.

Python was designed so that its users can learn the syntax easily, hence its emphasis on readability. This reduces the cost of program maintenance, as it enables teams to collaborate effectively without significant language and experience barriers. Furthermore, Python supports the use of modules and packages, encouraging program modularity and code reuse across a diversity of projects. Once a module or package has been developed, it may be reused in other projects. The Python interpreter and the extensive standard library are available in source or binary form, free of charge for all major platforms, and may be distributed with ease.
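As a small illustration of modules and code reuse, the sketch below borrows functionality from two modules of the standard library instead of writing it from scratch. The variable names are chosen only for this example.

```python
import math                   # import a whole module
from statistics import mean  # import a single function from a module

# Reuse the constant pi from the math module.
radius = 2.0
area = math.pi * radius ** 2
print(round(area, 2))

# Reuse the mean function from the statistics module.
scores = [70, 85, 91]
average = mean(scores)
print(average)
```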

Since its inception, the concept of Python being a "scripting language" has changed considerably. Python is now used to write large, commercial-style applications instead of only trivial ones. This reliance on Python has expanded even more as the Internet has gained popularity. Today, many web applications and platforms rely on Python, including Google's search engine, Instagram, and the web-oriented transaction system of the New York Stock Exchange (NYSE). Even NASA utilises Python to program its equipment and space machinery.

Key Advantages

Great for beginners

Python was designed to be readable and easily understood. Many beginners find coding with Python highly satisfying, as they are able to construct prototypes and tools quickly and with ease. Thus, Python’s simplicity has propelled its popularity as a beginner-friendly language. Today, it has replaced Java as the most popular introductory language at top American universities.

Broadly adopted and supported

The Python syntax is designed to be readable and straightforward. This simplicity makes Python an ideal teaching language, and it lets newcomers pick it up quickly. Due to the modest number of features in the language, users invest relatively little time in developing their first programs. Thus, developers spend more time thinking about how to solve a problem and less time thinking about language complexities or deciphering code left by others. As you step into the programming world, you will realise how vital support is. The developer community is all about giving and receiving help. The larger a community, the more likely you are to find help. According to the TIOBE Index, Python is one of the most broadly adopted and supported languages. Currently, Python is the 4th most-used language on GitHub and has the 5th largest Stack Overflow community. Hence, it is no surprise that Python has an abundance of libraries developed by its community of users that assist with data analysis and scientific computing.

Key Disadvantages

Not easy to maintain

Because Python is a dynamically typed language, the same piece of code may easily mean something different from one Python file to another, depending on the context. As a Python program grows in complexity, it may become increasingly difficult to maintain, since errors become harder to track down and fix. Hence, it takes experience and insight to know how to design your code or write unit tests to ease maintainability.
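The following sketch shows how the same line of code can mean different things depending on what is passed in, and how a small unit test can pin down the intended behaviour. The function combine is a made-up example.

```python
def combine(a, b):
    # No types are declared, so "+" does whatever the arguments make it do.
    return a + b


print(combine(2, 3))      # numbers are added
print(combine("2", "3"))  # strings are joined
print(combine([2], [3]))  # lists are concatenated


def test_combine_numbers():
    # A small unit test documents the behaviour the author intended.
    assert combine(2, 3) == 5


test_combine_numbers()
```

If a string slips in where a number was expected, Python raises no error here; the result is simply a different kind of value, which is why such bugs can be hard to trace.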

Slow

As a dynamically typed language, Python has to work out what each variable refers to, and what type it has, while the program is running rather than ahead of time. This slows down the performance of Python. One solution is to use PyPy, an alternative implementation of Python that is often faster. While it may still not be as fast as Java, there is nonetheless a significant improvement in processing speed.

Comparison to other languages

It is important to consider that Python may not be the ideal language to use in all situations. Although it is one of the most versatile languages, others offer features that address certain types of problems better.

Java

Python generally runs slower than Java. However, due to the language's built-in high-level data types and dynamic typing, Python programs take much less time to develop. Typically, Python programs are 3-5 times shorter than equivalent Java programs.
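To give a feel for what built-in high-level data types buy you, here is a short word count. It is only an illustration with a made-up sentence; no comparison with a specific Java program is implied.

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the end"

# split() turns the sentence into a list; Counter tallies each word.
counts = Counter(text.split())

print(counts["the"])
print(counts.most_common(1))
```

No types are declared and no loop is written by hand; the list and the counter do the work.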

JavaScript

JavaScript and Python are similar in terms of their "object-based" subset. Also, both languages support a programming style that uses simple functions and variables without engaging in class definitions.

Historically, that was largely all JavaScript offered (class syntax was added to the language only later). Python, by contrast, was designed to support writing much larger programs and better code reuse via its object-oriented programming style, where inheritance and classes play a vital role.
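The two programming styles mentioned here can be placed side by side in Python. The names describe, Animal, Dog and Cat are invented for this sketch.

```python
# Style 1: plain functions and variables, no class definitions.
def describe(name, sound):
    return f"{name} says {sound}"


print(describe("Generic animal", "..."))


# Style 2: classes and inheritance, so shared behaviour is written once.
class Animal:
    sound = "..."

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} says {self.sound}"


class Dog(Animal):
    sound = "Woof"  # only the difference from Animal is stated


class Cat(Animal):
    sound = "Meow"


for pet in (Dog("Rex"), Cat("Tom")):
    print(pet.describe())
```

Dog and Cat reuse the describe method of Animal without repeating it, which is the kind of code reuse that becomes valuable in larger programs.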

C++

C++ originated from the C language and is a compiled language. It is similar to Java in terms of running speed. However, C++ code tends to be 5-10 times longer than equivalent Python code.

How to install Python

When installing Python you can download the latest version of Python independently or as part of a distribution like Anaconda. The benefits of installing a Python distribution like Anaconda include a reduced risk of messing up the required system libraries and access to a wide variety of pre-installed open-source packages. Both options are listed below.

Installing Anaconda

Open your browser and follow this link to the Anaconda download page, and download the version recommended for your system (Linux, macOS, Windows). Run the downloaded installer; once it has finished, Anaconda is ready to use.

Installing Python

Python can be obtained from the Python Software Foundation website at python.org. You will need to download the relevant installer for your operating system and run it on your machine.

Windows

Step 1: Download the Python 3 Installer

  • Open a browser window and navigate to the Download page for Windows at python.org.
  • Click on the link labelled "Latest Python 3 Release - Python 3.x.x." (At the time of writing, the latest version is 3.7.0.)
  • Scroll to the bottom and select either Windows x86-64 executable installer for 64-bit or Windows x86 executable installer for 32-bit. (If your system has a 32-bit processor, you should choose the 32-bit installer. On a 64-bit system, either installer will actually work for most purposes. The 32-bit version will generally use less memory, but the 64-bit version performs better for applications with intensive computation.)

Step 2: Run the Installer

  • Once the installer has been downloaded, simply run it by double-clicking on the downloaded file. A dialog should appear that looks something like this:
  • Click Install Now. A few minutes later you should have a working Python 3 installation on your system.

MacOS

The best way to install Python 3 on macOS is via the Homebrew package manager.

Step 1: Install Homebrew (Part 1)

  • Open a browser and navigate to http://brew.sh/
  • Select the Homebrew bootstrap code under “Install Homebrew”, and copy it to the clipboard. Ensure that you copy the text of the complete command, otherwise the installation will fail.
  • Open a Terminal.app window, paste the Homebrew bootstrap code, and then hit Enter. This will begin the Homebrew installation.
  • If you’re doing this on a fresh install of macOS, you may get a pop up alert asking you to install Apple’s “command line developer tools”. You will need those to continue with the installation, so confirm the dialog box by clicking on Install.

Step 2: Install Homebrew (Part 2)

  • After confirming the “The software was installed” dialog from the developer tools installer, switch back to the Terminal.app window and hit Enter to continue with the Homebrew installation.
  • Homebrew will ask you to enter your password for permission to proceed with the installation. Enter your user account password and hit Enter to continue.
  • The installation will take a few minutes. Once the installation is complete, you will return to the command prompt in your terminal window.

Step 3: Install Python

  • Once Homebrew has finished installing, return to your terminal and run the following command:
  • This will download and install the latest version of Python. After the Homebrew brew install command finishes, Python 3 should be installed on your system.
  • You can make sure everything went correctly by testing if Python can be accessed from the terminal:
  • Open the terminal by launching Terminal.app.
  • Type pip3 and hit Enter. You should see the help text from Python’s “Pip” package manager. If you get an error message running pip3, go through the Python install steps again to ensure you have a working Python installation.
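Once the Python interpreter starts, a few lines of Python can confirm which version is actually running. This is a minimal sketch using only the standard sys module.

```python
import sys

# sys.version_info holds the version of the interpreter running this code.
major, minor = sys.version_info.major, sys.version_info.minor
print(f"Running Python {major}.{minor}")

# sys.executable is the path of the interpreter that was started.
print("Interpreter:", sys.executable)

if major < 3:
    print("This is Python 2; check that Python 3 is installed and on your PATH.")
```

The printed version and path depend on your own installation.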

How to launch Python

Once you have Python and an IDE installed, you are ready to begin coding! To do so, you would need to launch Python and the IDE first. As mentioned before, some IDEs are capable of running the code within the IDE itself. If your IDE is not capable of this, you would need to run your code on the command line or terminal.

Windows

  1. Open Command Prompt
  2. Execute the following command: py filename.py

Your script will be run by the latest version of Python you have installed.

MacOS

  1. Open Terminal window
  2. Execute the following command: python3 filename.py (or python filename.py if python refers to Python 3 on your system)

Your script will be run by the version of Python 3 you have installed.
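For completeness, here is what a file such as filename.py might contain. The contents are made up for illustration; any valid Python script can be run the same way.

```python
# filename.py
import sys


def main(args):
    # args holds anything typed after the file name on the command line.
    name = args[0] if args else "World"
    message = f"Hello, {name}!"
    print(message)
    return message


if __name__ == "__main__":
    # This part runs only when the file is executed, not when it is imported.
    main(sys.argv[1:])
```

Running the file with an extra word after the file name, for example Ada, greets that name instead of the default.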

Integrated Development Environments

The next important step in preparing your machine to begin coding is to download an integrated development environment (IDE). IDEs serve as text editors, dedicated to creating an easier environment to design, write, organize and share your code. When choosing which IDE is right for you it is important to consider the following features:

  • Language Compatibility: Some IDEs are better suited for certain languages. However, most IDEs are capable of understanding numerous languages.
  • Debugging Tools: Good editors have debugging tools to help find and correct errors in your code.
  • Syntax Highlighting: IDEs generally highlight keywords, symbols and variables, which makes your code easier to understand.
  • Automatic Formatting: The majority of IDEs are capable of correctly indenting lines and automatically closing parentheses based on the chosen language.
  • Ability to Run Code from within the IDE
  • Ability to Share Code on GitHub: Many IDEs include a simplified function for sharing code on GitHub and provide easy access to other repositories.

Spyder

Spyder is an open-source IDE included in the Anaconda Python Distribution. Spyder primarily targets data scientists and is specifically designed for Python use.

Spyder IDE

Visual Studio Code

Visual Studio Code is a text editor built on Electron. It is a lightweight IDE which can be configured to work on almost any task and is compatible with almost every language. It is also highly integrated with Git and GitHub.

VS Code IDE

Atom

Similar to Visual Studio Code, Atom is a sleek editor built on Electron which was originally intended for application design. The IDE was created by GitHub and has a large community around it. It supports JavaScript, HTML, and CSS; Python, however, must be run through an extension. One major advantage of Atom is its ability to incorporate packages from GitHub such as Hydrogen. Hydrogen enables its users to run selected snippets of code within the editor itself, as seen in the GIF below.

Atom IDE

Repl.it

Unlike the aforementioned IDEs, Repl.it is an online IDE where Python and JavaScript are the primary languages used. It is not as capable as most IDEs, but it still allows for basic coding on multiple devices, as users can log into their account from anywhere.

Repl.it

Jupyter Notebook

Jupyter is also included in the Anaconda Distribution and is a versatile tool for programming. Unlike the others, Jupyter is not a conventional IDE, as users are also able to document their work in it. However, Jupyter is not recommended as the only tool for writing complicated or extensive code. More information about Jupyter Notebook is given in the subsequent chapters.

Jupyter Notebook IDE

Introduction to Jupyter

Jupyter

Jupyter is a platform which enables its users to present their results as readable text while simultaneously sharing the original code as well. In recent years, Jupyter Notebook has become increasingly popular in the scientific community due to its efficacy in combining scientific results with interactive code in a plethora of programming languages. Jupyter can also be used independently as an IDE to develop, create, and run your code, making it an incredibly useful and versatile tool.

Jupyter Intro

Installing Jupyter

Jupyter comes preinstalled with the Anaconda Distribution used for Python. However, if you have already downloaded Python independently of Anaconda, there are alternative ways to install Jupyter. First make sure that you have the latest version of pip installed, by executing the command pip install --upgrade pip. If pip itself is not installed on your device, please follow the instructions on this link. Next, use the command pip install jupyter to install Jupyter Notebook onto your device.
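Whether the installation worked can also be checked from Python. The sketch below assumes that Jupyter Notebook is importable under the package name notebook; the helper is_installed is a made-up name for this example.

```python
import importlib.util


def is_installed(package_name):
    # find_spec returns None when Python cannot locate the package.
    return importlib.util.find_spec(package_name) is not None


for package in ("pip", "notebook"):
    status = "installed" if is_installed(package) else "missing"
    print(f"{package}: {status}")
```

If notebook is reported as missing, the pip install jupyter step has probably not completed for the Python version you are running.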

Launching Jupyter

For Windows users, launching Jupyter Notebook can be done easily through the Anaconda application, which may be accessed via the Start menu. macOS and Linux users first have to open the terminal window or command line. Navigate to the folder containing the files you want to open in Jupyter Notebook (using the commands from the introduction), then enter the command: jupyter notebook. This will open the selected folder in Jupyter Notebook's web application in your browser. From this point you can either open your existing .ipynb files or create a new notebook, which will look similar to the figure below.

Jupyter Menu

Once you have created a new notebook (or opened a previously existing one), you can begin by finding the Help tab in the menu bar and selecting the user interface tour, which takes you through an overview of the features of Jupyter's user interface (UI). One important feature is the cell. Cells are containers for text or code to be displayed or executed by the notebook's kernel.

Cell creation example

Markdown cells are used for writing text, creating tables and inserting images. These cells are written in Markdown. A brief introduction to Markdown is given in the subsequent section.

Code cells work like an IDE, in that you can use them to create and run your code in the notebook. To do so, write the code in a cell like the example below. To run it, select the desired cell and either use the shortcut shift + enter or hit the run button in the cell tab at the top of the page. Try running the Python code in the cell below!

In [1]:
print("Hello World!")
Hello World!

For a more extensive tutorial on Jupyter Notebook, follow this link.

Git and Github Extended

Github

GitHub is a collaborative code hosting site built on top of the Git distributed version control system (DVCS) (refer to Chapter 4.2 for a more detailed explanation of DVCS). GitHub relies on a “fork & pull” model in which developers create their own copy of a repository, make changes there, and then submit those changes via a pull request. With the pull request, developers ask the project maintainer to pull their changes into the main branch.

Git GitHub Diagram

In addition to code hosting, collaborative code reviewing, and integrated issue tracking, GitHub has integrated social features as well. Users are able to subscribe to information by “watching” projects and “following” users. Users can also award stars to repositories belonging to other users, which essentially has the same effect as "liking" a post on Facebook. Users also have profiles that can be populated with their personal information and that show their recent activity on the site. With over 57 million repositories hosted, GitHub is currently the largest code hosting site in the world.

GitHub Scroll

Version Control Systems

People might need to collaborate with developers on other systems, and version control systems are one way to do it. Version control systems record changes to a file over time so that it is possible to recall specific versions later. In other words, version control systems allow one to revert selected files, or even the entire project, back to a previous state. They also allow one to compare changes over time and to see who last modified something, who introduced an issue, and when. We typically make a distinction between centralised and distributed version control systems.

Centralised Version Control System (CVCS)

In centralized version control systems, each user typically gets their own working copy, but there is only one central repository, often located on a remote server. As soon as one user commits, the other developers can update their working copies and see the changes. Because the history lives only on the central server, users need to contact that server to check who made a change and what the change was. The centralized server contains all the versioned files, and a number of developers check out files from that central place.

CVCS

However, there are downsides to this. Firstly, the most obvious problem is the single point of failure that the centralized server represents. If the server goes down, nobody can collaborate or save versioned changes for the duration of the outage. Secondly, another major issue is the hard disk: when the entire history is saved in one place, one risks losing everything if that system fails or crashes.

Distributed Version Control Systems (DVCS)

DVCS

In distributed version control systems, users get their own repository and working copy. Unlike a centralized version control system where working on a single server presents a major risk for the project development, distributed version systems stores in each users' local repository the full history of the file. Thus, if any server fails, edited repositories may be copied and restored back on the server.

For a modification to travel from one developer to another, four steps need to be executed. First, you make a commit. At this stage, others still have no access to the changes; they only become available once you, secondly, push your changes to the shared repository. Thirdly, the others pull those changes into their own repositories, and finally they update their working copies. The same holds in the other direction: when you update, you do not get others' changes unless you have first pulled those changes into your repository. Since the system is distributed, i.e. each developer gets their own local repository, nearly every operation can be done offline at high speed. This means that you can do commits, branches, merges, file annotation, etc. entirely offline and generally almost instantly.
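The commit, push and pull steps described above can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions: the class `ToyRepo` and the names `alice`, `bob` and `server` are hypothetical, a commit is reduced to its message, and nothing here calls Git itself.

```python
# Toy model of a distributed workflow: every repository keeps its own
# list of commits, and changes only travel through explicit push/pull.
class ToyRepo:
    def __init__(self, name):
        self.name = name
        self.commits = []  # full local history, oldest first

    def commit(self, message):
        # A commit is recorded locally only; no network is involved.
        self.commits.append(message)

    def push(self, remote):
        # Send the commits that the remote does not have yet.
        for c in self.commits:
            if c not in remote.commits:
                remote.commits.append(c)

    def pull(self, remote):
        # Fetch the commits from the remote that are missing locally.
        for c in remote.commits:
            if c not in self.commits:
                self.commits.append(c)


server = ToyRepo("server")
alice = ToyRepo("alice")
bob = ToyRepo("bob")

alice.commit("Add README")
visible_before_push = list(bob.commits)  # bob cannot see the commit yet
alice.push(server)
bob.pull(server)
visible_after_pull = list(bob.commits)   # now the commit has arrived

print(visible_before_push, visible_after_pull)
```

In this sketch the commit exists only in `alice` until she pushes, and `bob` only sees it after he pulls, which mirrors the order of steps in the paragraph above.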

Advantages and Disadvantages of CVCS and DVCS

Feature | DVCS | CVCS
Users can work productively when not connected to a network | Yes | No
Common operations such as commits, reverting changes, etc. are faster | Yes | No
Users can make and use changes locally without publishing them | Yes | No
Initial checkout of a repository is slower (since all branches have to be copied into each local repository) | Yes | No
Additional storage required for every user to have a complete copy of the codebase history | Yes | No
Working copies are effectively remote backups | Yes | No
Various development models can be used | Yes | No

Installing Git

Installing Git is not complicated. Go to the Git homepage and look for the "Download" section. You then have to select the operating system you are working on.

Git Basics

We previously introduced Git as a distributed version control system. This means that it allows users to collaborate efficiently on a project. It is also able to perform actions extremely fast, since for most operations Git only needs to access the local hard drive. With all this information, you may still wonder what goals the Git developers had in mind when developing this tool. We list 5 important goals:

  • Speed.
  • Simple design.
  • Strong support for non-linear development.
  • Fully distributed.
  • Able to handle large projects efficiently.

Speed

A tool that registers modifications to files quickly makes collaboration easier, and Git is often praised for its speed compared to other systems. The major difference between Git and most other VCSs is the way Git thinks about data. Most systems store data as a set of files plus the changes made to each file over time. Git instead thinks of its data as a series of snapshots of the whole project. Every time you commit, Git records what all your files look like at that moment; if a file has not changed, Git does not store it again but simply links to the identical copy it already has. To find out what changed, Git compares the stored version with the updated file on your own computer. Because it does not have to ask a remote server to do this, most operations are drastically faster.
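The snapshot idea can be sketched in a few lines of Python. This is a simplified illustration, not Git's actual storage format: the names `object_store`, `store` and `snapshot` are hypothetical, and the hash is taken over the raw file content only.

```python
import hashlib

# hash -> file content; each distinct content is stored exactly once
object_store = {}


def store(content: bytes) -> str:
    """Store content under its hash and return that hash."""
    key = hashlib.sha1(content).hexdigest()
    object_store.setdefault(key, content)  # identical content is not stored twice
    return key


def snapshot(files: dict) -> dict:
    """Return a mapping of file name -> content hash for the whole project."""
    return {name: store(data) for name, data in files.items()}


v1 = snapshot({"a.txt": b"hello\n", "b.txt": b"world\n"})
v2 = snapshot({"a.txt": b"hello\n", "b.txt": b"world!\n"})  # only b.txt changed

# Comparing two snapshots is a purely local operation.
changed = [name for name in v2 if v1.get(name) != v2[name]]
print(changed)
print(len(object_store))
```

Two snapshots of two files each end up as only three stored objects here, because the unchanged `a.txt` is shared between both snapshots.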

Simple design

For many beginners, Git is difficult to grasp. And in fact it is, especially if you are a Windows user, since Git has traditionally been best supported on Linux, followed by Mac. You will have to learn and understand a lot of new notions and definitions, and a basic knowledge of Git functions is required. The command syntax is also complex and sometimes uses unusual names. However, once you have mastered it, you will have a better understanding of Git's functions and mechanisms, and you should find that Git is a quite user-friendly program that allows you to structure your project efficiently.

Strong support for non-linear development (branches)

A central feature of Git is branching. In Git, you can create a new local branch for everything you work on. The new local branch is a side branch that is connected to the mainline, also known as the master branch. For each feature, idea or bugfix, you can easily create a new branch, do a few commits on that branch and then either merge it into your master branch or throw it away. You don’t have to mess up the master branch just to save or test your experimental ideas.
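One way to picture a branch is as a named pointer to a commit. The following Python sketch models that idea under strong simplifications: the commit ids `c1` to `c4`, the dictionaries and the helper functions are all hypothetical, and only the simplest kind of merge (moving `master` forward to the tip of the side branch) is shown.

```python
# commit id -> parent commit id (None for the very first commit)
parents = {"c1": None, "c2": "c1"}
# branch name -> id of the commit the branch currently points at
branches = {"master": "c2"}


def create_branch(name, start):
    # A new branch starts at the same commit as the branch it is created from.
    branches[name] = branches[start]


def commit(branch, commit_id):
    # The new commit remembers its parent, then the branch pointer moves forward.
    parents[commit_id] = branches[branch]
    branches[branch] = commit_id


def fast_forward(target, source):
    """Move target up to source if source already contains target's history."""
    node = branches[source]
    while node is not None:
        if node == branches[target]:
            branches[target] = branches[source]
            return True
        node = parents[node]
    return False


create_branch("bugfix", "master")
commit("bugfix", "c3")
commit("bugfix", "c4")
master_during_work = branches["master"]  # master is untouched by the side work
merged = fast_forward("master", "bugfix")

print(master_during_work, branches["master"], merged)
```

Throwing the experiment away instead of merging it would, in this model, simply mean removing the `bugfix` entry from `branches`; `master` would never have been affected.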

Fully distributed

In this context, fully distributed means that every developer has their own repository containing the entire commit history of the project. This is a central property of distributed version control systems.

Able to handle large projects efficiently

Git has some extensive functions to deal with large repositories with a very long history. Two solutions to deal with large repositories are presented by the Atlassian blog:

  • The git shallow clone: Instead of downloading the whole history of the repository, we pull down only the latest n commits of the history.
  • The git filter-branch command: This command rewrites the history of selected branches, which allows you to reduce their complexity by deleting or modifying parts of the tree structure.
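As a rough sketch of the first option, a shallow clone is requested with the `--depth` option of `git clone`. The Python helper below only builds the command; the function name `shallow_clone_command` and the repository URL are hypothetical placeholders.

```python
def shallow_clone_command(url: str, depth: int = 1) -> list:
    """Build the command line for cloning only the latest `depth` commits."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return ["git", "clone", "--depth", str(depth), url]


# Hypothetical repository URL, used for illustration only.
cmd = shallow_clone_command("https://example.com/hypothetical/project.git", depth=10)
print(" ".join(cmd))

# To actually run the command (requires Git and network access):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Keeping the command as a list of arguments rather than one string means it can be passed to `subprocess.run` without going through a shell.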

Git Characteristics

Git Has Integrity

It’s impossible to change the contents of any file or directory without Git detecting it. If a file is lost, if information within a file is lost, if a file gets corrupted or if any other change has happened, Git is able to detect it. This is due to the fact that every piece of information in Git has a corresponding hash value that is computed from its contents.
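To make the hash idea concrete, here is a minimal Python sketch of how the hash of a file's contents can be computed. It assumes a repository that uses SHA-1 object names (Git's long-standing default); the function name `git_blob_hash` is our own.

```python
import hashlib


def git_blob_hash(data: bytes) -> str:
    """Hash file contents the way Git names a blob object (SHA-1 assumed)."""
    # Git prefixes the contents with a small header: "blob <size>\0"
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


original = git_blob_hash(b"hello world\n")
tampered = git_blob_hash(b"hello world!\n")

print(original)
print(original != tampered)  # any change in the contents changes the hash
```

For the same contents, the value printed here should match what `git hash-object` reports for a file, which is why even a one-character change cannot go unnoticed.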

Git Generally Only Adds Data

When you do actions in Git, nearly all of them only add data to the Git database. After you commit a snapshot into Git, it is very difficult to lose the information.

The Three States

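You will occasionally hear about "the three states" in Git. This simply refers to the three states a file can be in: modified (changed, but not yet committed), staged (marked to go into the next commit) or committed (safely stored in your local repository).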
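The usual life cycle of a file through these states can be sketched as a small state machine in Python. The names `FileState`, `TRANSITIONS` and `apply_action` are hypothetical, and the model is deliberately simplified: it follows a single file through one edit, one `git add` and one `git commit`.

```python
from enum import Enum


class FileState(Enum):
    MODIFIED = "modified"
    STAGED = "staged"
    COMMITTED = "committed"


# (current state, action) -> next state
TRANSITIONS = {
    (FileState.COMMITTED, "edit"): FileState.MODIFIED,   # you change the file
    (FileState.MODIFIED, "add"): FileState.STAGED,       # git add
    (FileState.STAGED, "commit"): FileState.COMMITTED,   # git commit
}


def apply_action(state, action):
    """Return the next state, or raise if the action does not fit the state."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a {state.value} file") from None


state = FileState.COMMITTED
for action in ["edit", "add", "commit"]:
    state = apply_action(state, action)
    print(action, "->", state.value)
```

The loop walks the file through modified and staged and back to committed, which is the cycle you repeat with every change you record.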

Step-by-Step GitHub

Create a Repository

A repository is usually used to organize a single project. Repositories can contain folders and files, images, videos, spreadsheets, and data sets – anything your project needs. We recommend including a README, or a file with information about your project. GitHub makes it easy to add one at the same time you create your new repository. It also offers other common options such as a license file.

Your hello-world repository can be a place where you store ideas and resources, or even share and discuss things with others.

To create a new repository:

  1. In the upper right corner, next to your avatar or identicon, click + and then select New repository.
  2. Name your repository hello-world.
  3. Write a short description.
  4. Select Initialize this repository with a README.
  5. Click Create repository.

New Repository

Create a Branch/fork

Branching is the way to work on different versions of a repository at one time.

By default your repository has one branch named master which is considered to be the definitive branch. We use branches to experiment and make edits before committing them to master.

When you create a branch off the master branch, you’re making a copy, or snapshot, of master as it was at that point in time. If someone else made changes to the master branch while you were working on your branch, you could pull in those updates.

This diagram shows:

  • The master branch
  • A new branch called feature (because we’re doing ‘feature work’ on this branch)
  • The journey that feature takes before it’s merged into master

Branch

Have you ever saved different versions of a file? For instance:

  • story.txt
  • story-joe-edit.txt
  • story-joe-edit-reviewed.txt

Branches accomplish similar goals in GitHub repositories.

During a programming course like this one, you can use branches to keep bug fixes (improving or repairing code) and feature work (adding new functions to an application) separate from the master branch (the production branch, which contains all accepted side branches/forks). When a change is ready, you merge your branch into master.

To create a new branch

  1. Go to your new repository hello-world.
  2. Click the drop down at the top of the file list that says branch: master.
  3. Type a branch name, readme-edits, into the new branch text box.
  4. Select the blue Create branch box or hit “Enter” on your keyboard.

Create Branch

Now you have two branches, master and readme-edits. They look exactly the same, but not for long! Next we’ll add our changes to the new branch.

Make and commit changes

Bravo! Now, you’re on the code view for your readme-edits branch, which is a copy of master. Let’s make some edits.

On GitHub, saved changes are called commits. Each commit has an associated commit message, which is a description explaining why a particular change was made. Commit messages capture the history of your changes, so other contributors can understand what you’ve done and why.

Make and commit changes

  1. Click the README.md file.
  2. Click the pencil icon in the upper right corner of the file view to edit.
  3. In the editor, write a bit about yourself.
  4. Write a commit message that describes your changes.
  5. Click the Commit changes button.

Make and commit

These changes will be made to just the README file on your readme-edits branch, so now this branch contains content that’s different from master.

Open a Pull Request

Nice edits! Now that you have changes in a branch off of master, you can open a pull request.

Pull requests are the heart of collaboration on GitHub. When you open a pull request, you are asking the original author to review your proposed changes, pull in your contribution and merge it into their branch. Pull requests show the differences between the content of both branches. The changes, additions, and subtractions are shown in green and red. As soon as you make a commit, you can open a pull request and start a discussion, even before the code is finished.

By using GitHub’s @mention system in your pull request message, you can ask for feedback from specific people or teams, whether they’re down the hall or 10 time zones away.

You can even open pull requests in your own repository and merge them yourself. It’s a great way to learn the GitHub flow before working on larger projects.

Open a Pull Request for changes to the README

Create Pull