Introduction to Shell for (Data) Scientists

KAUST Visualization Core Lab

17 February 2020

8:30 am - 5:00 pm

Instructors: David R. Pugh

Helpers: TBD

General Information

The Visualization Core Lab will host an Introduction to the Shell for Data Scientists workshop. Visualization Core Lab staff will provide an introduction to shell commands, designed for learners with little or no previous experience of working at the command line.

The Unix shell has been around longer than most of its users have been alive. It has survived so long because it’s a power tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks so they aren’t typing the same things over and over again. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources, including clusters hosted locally at KAUST (e.g., Ibex, Nesser, Shaheen) and in the cloud (e.g., GCP, AWS, Azure).

Topics covered are listed in the Syllabus section below.

This hands-on lesson is part of the Introduction to Data Science Workshop Series offered by the KAUST Research Computing Core Labs as part of our ongoing efforts to build capacity in core data science skills at KAUST. The workshop largely follows the curriculum developed by Software Carpentry, a volunteer project dedicated to helping researchers get their work done in less time and with less pain by teaching them basic research computing skills.

This is a live-coding workshop, and learners are expected to bring their own laptops with the required software already downloaded and installed.

For more information on what Software Carpentry teaches and why, please see their paper "Best Practices for Scientific Computing".

Who: The course is aimed at graduate students (MSc and PhD), postdocs, faculty, and other research staff at KAUST. You don't need any previous knowledge of the tools that will be presented at the workshop.

Where: Auditorium 215 (between bldg. 2-3, level 0). Get directions with OpenStreetMap or Google Maps.

When: 17 February 2020. Add to your Google Calendar.

Registration: Register Now!

Course Materials: Introduction to the Unix Shell for (Data) Scientists

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide what you need.

Contact: Please email david.pugh@kaust.edu.sa for more information.


Code of Conduct

Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.


Collaborative Notes

We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.


Surveys

Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey


Schedule

Before  Pre-workshop survey
09:00   Introduction to the Unix Shell
10:30   Morning break
10:45   Pipes and Filters
12:00   Lunch break
13:00   Loops
14:30   Afternoon break
14:45   Writing Shell scripts
16:30   Wrap-up
17:00   END
After   Post-workshop survey

Syllabus

The Unix Shell

  • Files and Directories
  • History and Tab Completion
  • Pipes and Redirection
  • Looping Over Files
  • Creating and Running Shell Scripts
  • Finding Things
  • Reference...
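
As a taste of the topics listed above, here is a minimal sketch that combines a pipe with a loop over files. The file names and their contents are made up for illustration, and everything happens in a throwaway directory created with mktemp, so nothing on your machine is changed.

```shell
# Work in a temporary directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create three small example files (hypothetical data).
printf 'a\nb\nc\n' > one.txt
printf 'a\nb\n' > two.txt
printf 'a\n' > three.txt

# Pipes: count the lines in each file, sort the counts numerically,
# and keep only the first (smallest) entry.
shortest=$(wc -l *.txt | sort -n | head -n 1)
echo "Shortest file: $shortest"

# Loops: make a backup copy of every .txt file.
for f in *.txt; do
    cp "$f" "$f.bak"
done

# List the directory to see the originals and their backups.
ls
```

Each command in the pipeline does one small job; the shell's contribution is connecting them, which is the idea the workshop builds on.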

Setup

To participate in a Software Carpentry workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but learners may find it useful too.

The Bash Shell

Bash is a commonly-used shell that gives you the power to do simple tasks more quickly.

Windows (Video Tutorial)
  1. Download the Git for Windows installer.
  2. Run the installer and follow the steps below:
    1. Click on "Next" four times (two times if you've previously installed Git). You don't need to change anything in the Information, location, components, and start menu screens.
    2. Select "Use the nano editor by default" and click on "Next".
    3. Keep "Use Git from the Windows Command Prompt" selected and click on "Next". If you forget to do this, programs that you need for the workshop will not work properly. If this happens, rerun the installer and select the appropriate option.
    4. Click on "Next".
    5. Keep "Checkout Windows-style, commit Unix-style line endings" selected and click on "Next".
    6. Select "Use Windows' default console window" and click on "Next".
    7. Click on "Install".
    8. Click on "Finish".
  3. If your "HOME" environment variable is not set (or you don't know what this is):
    1. Open the Command Prompt (open the Start Menu, type cmd, and press [Enter]).
    2. Type the following line into the command prompt window exactly as shown:

      setx HOME "%USERPROFILE%"

    3. Press [Enter]. You should see SUCCESS: Specified value was saved.
    4. Quit the Command Prompt by typing exit and then pressing [Enter].

This will provide you with both Git and Bash in the Git Bash program.
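
Once Git Bash is installed, one way to confirm that the HOME variable is visible to Bash is a check along the following lines. This is only a sketch: the variable name home_status is made up for illustration, and the commands should be typed into Git Bash (or any Bash terminal), not the Windows Command Prompt.

```shell
# Report whether the HOME environment variable is set and non-empty.
if [ -n "${HOME:-}" ]; then
    home_status="HOME is set to: $HOME"
else
    home_status="HOME is not set; repeat the setx step above and reopen the terminal"
fi
echo "$home_status"
```

If HOME is reported as not set, close every open terminal window before trying again, because setx only affects windows opened after it has run.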

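The default shell in macOS versions before Catalina (10.15) is Bash; from Catalina onward the default is Zsh, but Bash is still installed, so there is no need to install anything. You access the shell from the Terminal (found in /Applications/Utilities), and if it is not already Bash you can start Bash by typing bash. See the Git installation video tutorial for an example of how to open the Terminal. You may want to keep Terminal in your dock for this workshop.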

On Linux, the default shell is usually Bash, but if your machine is set up differently you can run it by opening a terminal and typing bash. There is no need to install anything.
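
Whichever operating system you use, a quick check like the one below can confirm before the workshop that Bash is available. It is a minimal sketch; the variable name bash_version is chosen here for illustration.

```shell
# Show the path of your login shell, if the SHELL variable is set.
echo "Login shell: ${SHELL:-unknown}"

# Ask Bash for its version; this fails if Bash is not on your PATH.
bash_version=$(bash --version | head -n 1)
echo "$bash_version"
```

If the second command prints a version line, Bash is installed, even if your login shell is something else.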