SATRDAY

The R community and some of South Africa's most forward-thinking companies have come together to bring satRday back to Cape Town. This conference provides an opportunity to hear from and network with top researchers, data scientists and developers from the R community in South Africa and beyond.

Some pictures from Cape Town's second satRday in March 2018.

Speakers

Keynote Speakers

Maëlle Salmon
Data Scientist

Maëlle is a data scientist with experience mostly in public health, both in infectious disease and environmental epidemiology. She holds a PhD in statistics from the Ludwig Maximilians University of Munich and has lived in several European countries. She is a very enthusiastic R user and loves sharing her passion, with involvement in the community including blogging, co-founding the R-Ladies Barcelona meetup, package development and being a co-editor at rOpenSci, providing an onboarding system for R package developers.

A blog post that Maëlle wrote about her experience at satRday (Cape Town) 2018.

Stephanie Kovalchik
Data Scientist and researcher

Dr. Stephanie Kovalchik is the lead data scientist in the Game Insight Group at Tennis Australia, the governing body of tennis in Australia, and a Research Fellow in sports analytics at the Institute of Sport, Exercise and Active Living at Victoria University. Her research focuses on the use of statistical methods to understand performance, game strategy, and mentality in high-performance tennis. Stephanie received her PhD in statistics from UCLA and her Bachelor of Science from Caltech. Stephanie is an avid R user and the author of several packages, including RISmed (an R interface to NCBI databases) and deuce (a package for tennis analytics). She is also the creator of the tennis analytics blog On the T and regularly writes about tennis there and on Twitter.

A blog post that Stephanie wrote about her experience at satRday (Cape Town) 2018.

Speakers

Neil Rankin
Professor
Stellenbosch University
Peter Kamerman
Physiologist
University of the Witwatersrand
Michael Johnson
BI Architect
SQLSA
David Lubinsky
Managing Director
OPSI Systems
Vishalin Pillay
Data Analyst
Derivco
Schalk Heunis
Senior Data Scientist
Vodacom Big Data Analytics
Marc van Heerden
Advanced Analyst
Bain & Company
Wiebke Toussaint
Data Scientist
UCT Energy Research Centre
Sean Soutar
Student
University of Cape Town
Jasen Mackie
Product/Business Owner
IRESS SA Electronic Execution
Jed Stephens
Student
University of Cape Town
Wasim Lorgat
Student
University of Cape Town
Robert Bennetto
Head of Consulting
Pivot Sciences
Deveshnie Mudaly
Data Analyst
Derivco
Hanjo Odendaal
Data Scientist
Stellenbosch, BER, Eighty20
Andrew Clark
Hobbyist
Freelancer
Alice Elizabeth Coyne
Analyst
Gen Re
Christopher Waspe
Student
University of KwaZulu-Natal
Naas van Heerden
CEO and Co-founder
The Profit Table
Neil Watson
Lecturer
Department of Statistical Sciences
Maphale Matlala
Ecosystem Classification scientist
SANBI

Workshops

The satRday Cape Town conference will kick off on the 16th of March 2018 with a day of workshops held by our Keynote Speakers. Note that these will be full-day workshops and will take place in parallel, so you'll have to choose between them!

R package development

...from laying functions to package hatching!

Maëlle Salmon

This hands-on workshop will allow you to transform a bunch of R functions into an R package. You'll also learn how to make it usable and used! We'll assume you can write functions and have a basic familiarity with Git and GitHub.

Here is what we'll learn:

  • Why develop a package?
  • What is in a package?
  • devtools workflow to create a package
  • Automatic tools for improving a package, from R CMD check to linting
  • Creating a slick documentation website with pkgdown
  • Why and how to have your package reviewed?
  • How to make your package famous or at least reach your audience?
  • Package analytics via CRAN logs and via the gh package.

Attendees should bring a laptop with the latest versions of R and RStudio installed. Also bring your experience and questions about packages!

The Sport Statistician’s Toolbox in R

Stephanie Kovalchik

The workshop will cover a number of skills and statistical models that are common in sports statistics and show how each can be implemented in R. The workshop will introduce participants to a range of R packages and real sports examples will be used throughout. It will be very hands-on!

After completing this workshop, participants will be able to gather and clean public sports data more effectively, explore data with graphics, apply several common models used in sport, and share their sports statistics ideas through a blog. A brief outline of the workshop is given below.

Workshop Outline:

  • Sports data resources
  • Web scraping
  • Data exploration and validation
  • Sports models and real applications
    • Bradley-Terry paired comparison models.
      Application: Which team is the strongest in the English Football Association League?
    • Pythagorean Theorem.
      Application: How can we predict an NBA team's expected wins for the season?
    • Forecasting.
      Application: What is the chance that a tennis serve will be a service winner?
    • Generalized Additive Models.
      Application: How can we model the strike zone from baseball tracking data?
  • Sports Blogging

Further details and instructions will be sent to participants prior to the workshop.

Data Carpentry Workshop for R novices

This year we will offer a Data Carpentry workshop which will include R programming for novices. Data Carpentry workshops teach introductory computational skills needed for data management and analysis in all domains of research. For more information on syllabus and requirements please visit the workshop webpage.

The Data Carpentry Workshop will take place on the 15th and 16th March 2018. Note that the workshop is sold out.


Programme

Workshops Programme

Start End Friday 16 March 2018
8:30 9:00 Registration
9:00 10:30 First Session
10:30 11:00 Tea / Coffee
11:00 12:30 Second Session
12:30 13:30 Lunch
13:30 15:00 Third Session
15:00 15:30 Tea / Coffee
15:30 17:00 Fourth Session

Conference Programme

Standard talks are 20 minutes and lightning talks are a mere 5 minutes.


Start End Saturday 17 March 2018
8:00 8:30 Registration
8:30 8:40 Welcome
8:40 9:30 Wiebke Toussaint
  • Tidy geometries in R (Robert Bennetto)

    R provides several packages to help wrangle geometries. Many of these geometry packages precede the tidyverse - meaning they are almost impossible to use correctly with the tidyverse. This talk will walk you through the principles of using `sf` - a spatial package that is tidy compliant.

  • Presenting the R package FlowCAr: Flow Network Construction and Analysis in R (Christopher Waspe)

    FlowCAr is an R package which allows the user to understand the 'ins and outs' of any flow network. With limited data, a complex flow network can be accurately modeled, visualized and analysed. Using various sources of data, the unknown information in a system can be solved and the system analysed.

  • Automated Report Generation with R (Vishalin Pillay)

    We're often guilty of working in front of our noses, scripting in the 'here and now' and recycling this habit instead of the code. You don't need a seance to glimpse a little of how future work could be spared by generalizing your analysis templates and injecting some dynamism into the flow.

  • Testing the IUCN Red List for Ecosystems methodology on South African datasets (Maphale Matlala)

    My project focuses on comparing two of the South African and IUCN RLE criteria that assess spatial symptoms of ecosystem collapse. The redlistr package is used to conduct these assessments, and a non-parametric test is employed to determine whether there is a significant difference between the two methodologies.

9:30 10:30 Keynote #1: (Maëlle Salmon)
Katrin Tirok
  • Our package reviews in review: introducing and analyzing rOpenSci onboarding system

    rOpenSci is a community of researchers and software developers working together to provide better R tooling, mostly packages, for reproducible and open science. Some of these packages are contributed by staff, others by community members. In order to ensure the quality of packages in the suite, rOpenSci _onboards_ packages from the community by having them undergo an open review process on GitHub. In this talk, you'll get introduced to this onboarding system. You'll hear about brand-new analyses of the system in R thanks to my [rectangling](https://www.youtube.com/watch?v=GapSskrtUzU) onboarding. You'll also get to learn about our continuous efforts at improving our system even more, in particular via automation. Come hear this talk if you want to know more about software review and about Git(Hub) data analysis!

10:30 11:00 Coffee/Tea
11:00 12:30 Frederik Louw
  • Making your exploratory data analysis purrr (Peter Kamerman)

    If you find digging into nested and repeated measures data tedious and overly repetitive, list-columns and the power of purrr can alleviate the pain.

  • Gathering Data from Dynamic Web Forms using RSelenium and Docker (Sean Soutar)

    Scraping data can be difficult, especially when dealing with dynamic forms. I present a case study of how I used RSelenium and Docker to simulate a user interacting with an ASP.NET form. This was done on the South African Reserve Bank’s online statistical query to automate file downloads.

  • Futurology! (Alice Elizabeth Coyne)

    Futurology (noun) - The study or forecasting of trends in science, technology, or social structure, etc. Life is all about the extremes! Even right now, Cape Town is in a drought. Not just a normal drought, an extreme drought! This talk looks at various techniques used to predict extremes.

  • Productivity Hacks (Andrew Collier)

    Do more data science, faster! How can you be more productive, and have more fun in the process?

  • My Journey From Python to R (featuring Dota 2) (Wasim Lorgat)

    This talk is a lightning-documentation of my (open and reproducible) journey from Python to R. To make it more interesting, the learning process is centered around mini-projects in analysing data from Dota 2 games (one of the most popular eSports games and a personal favourite)!

  • R is for Racing (Schalk Heunis)

    Using R to train the ConvNet of a self-driving RC car. The car uses a Raspberry Pi to stream images into a Keras-based model that outputs direction and speed in real time. R was used to visualize and augment training data, train the model, visualize the results and iteratively improve performance.

  • R in strategy consulting (Marc van Heerden)

    Discussing how R is well suited to the fast-paced, collaborative and distributed needs of modern strategy consulting as well as how R is used at Bain & Company and where it fits in with the consultant/supporting staff’s toolset.

12:30 13:30 Lunch
13:30 14:20 Marie Dussault
  • Identifying bias in South African graduate recruitment (Neil Rankin)

    There is a lot of evidence that gender and racial biases affect hiring. In South Africa the Employment Equity Act aims to correct this. We use a dataset of graduate applications to show that males and white candidates are favoured by recruiters, and discuss what leap.ly is doing to address this.

  • Exact Factors - Improving Haven's Exporting from R to Stata (Jed Stephens)

    This talk introduces a suite of functions that extend Wickham's Haven package to provide the user with specific levels for factors - a critical aspect of Stata-ready data.

    Haven's current exports have arbitrary factor levels, which are unacceptable, especially with survey data.

  • Using SQL from R: A short introduction (Deveshnie Mudaly)

    The sqldf package can be used to execute SQL statements in R. For those competent in SQL, this function is extremely useful. SQL is a powerful language and, when used in conjunction with R, can enhance and improve the analysis of data. I have made use of SQL to simplify my data frames quickly.

  • Modeling game-play momentum in Rugby Union with RShiny (Neil Watson)

    How do situational variables like field position, current score, and time remaining influence a team's momentum? We combine data wrangling and visualization with rich historical game data to produce a momentum 'map' that provides decision support in the form of insight into the best actions to take.

14:20 15:20 Keynote #2: (Stephanie Kovalchik)
Theoni Photopoulou
  • An R Engine for Real-Time Sports Analytics

    The Game Insight Group (GIG) at Tennis Australia is aiming to revolutionize statistical thinking in tennis. To meet that goal, we are developing new performance metrics and providing these to players, broadcasters, and fans in real time at professional tennis events. R is at the core of our analytics tools. In this talk, I will describe how we are using R to design real-time applications for delivering advanced sports statistics to a range of stakeholders. The talk will highlight strategies for developing efficient code, automating processes, and creating customized statistical reports in R Markdown.

15:20 15:50 Coffee/Tea
15:50 17:00 Takwanisa Machemedze
  • Listening to the news: Big data for consumer sentiment analysis (Hanjo Odendaal)

    We are currently researching how to incorporate a new form of information, textual data, into the existing mixed-frequency econometric framework. Can we use alternative secondary data sources to construct a consumer sentiment index that captures perceived economic outlook? I believe so.

  • Why build Models as Apps? (Naas van Heerden)

    The talk makes the case for building models as dynamic apps. R + package development + Shiny have massive potential as an alternative to traditional static spreadsheet models. Included will be an illustration using a commercially marketed Shiny application that models loan profitability.

  • ckan + R: workflows for data retrieval and archiving (Wiebke Toussaint)

    Ever wished you could suck that cited dataset directly into your models? ckanr makes it possible. No downloads, no convoluted folders, no lost data. Learn to create workflows that combine the power of CKAN, a leading data portal platform, with R to make data archiving and retrieval a breeze.

  • Simulating Trading Strategies for Testing Skill vs Luck or Overfitting (Jasen Mackie)

    Simulating the P&L of quant trading strategies can shed light on the dynamics of a strategy, like confidence intervals of max drawdown, but simulating individual trade entries and exits can uncover more insights ultimately discerning skill vs luck or overfitting. Introducing txnsim() in blotter!

  • Creating interactive plots in five minutes with datapasta and plotly (Andrew Clark)

    Ever seen a table on the web and thought "I wonder what that would look like as a chart" or "I really need to impress my boss in the next 15 minutes"?

    Here is how to turn a dry set of figures into a compelling visual. Just copy a table, paste it into R, add a few lines of code and voila!

  • The R profiler: R's best kept secret (David Lubinsky)

    Your R program is running too slowly and you don't know why. The RStudio profiler will tell you exactly where all the time is being spent along with useful visualisations. Profiling is an essential, but often overlooked, tool for every serious R developer's toolkit.

17:00 17:10 RStudio Student Lightning Talk Prize
17:10 17:15 Closing
17:30 19:00 Cocktails and Conversation

Panel for Student Lightning Talk competition:

  • Alice Coyne,
  • Jasen Mackie and
  • Marc van Heerden.

R-Ladies Event

When: March 15, 6:00 PM to 8:30 PM.

Where: Bandwidth Barn, Woodstock Exchange, 66-68 Albert Road, Cape Town

Both satRday keynote speakers, Maëlle Salmon and Stephanie Kovalchik, are involved in R-Ladies. This is a worldwide organisation whose mission is to promote gender diversity in the R community, with chapters in cities all over the world.

The R-Ladies Cape Town chapter has invited Maëlle and Stephanie, as well as Marie Dussault from Pretoria, to come and meet their chapter. Maëlle, Stephanie and Marie have offered to give short presentations about creating blogs with R.

The event is primarily aimed at women (including all female-identifying people) but men are welcome provided they do not take a leadership role.

This meet-up will provide a great opportunity for all current and prospective R users to meet and chat with like-minded people. We'd love to see you there, especially if this is your first R-Ladies event. This invitation still stands even if you aren't participating in satRday this year.

If you are interested in attending similar events in the future, please get in contact with us via our Meetup page or our Twitter account.

Code of Conduct

satRday is dedicated to providing a harassment-free and inclusive conference experience for all in attendance regardless of, but not limited to, gender, sexual orientation, disabilities, physical attributes, age, ethnicity, social standing, religion or political affiliation.

We do not tolerate harassment of participants (including organisers and vendors) in any form. Sexual innuendos and imagery are not appropriate for any conference venue, including presentations.

Anyone violating these rules may be given a warning or expelled from the conference (without a refund) at the discretion of the conference organisers.

Our code of conduct/anti-harassment policy can be found here.

Venue

Situated on the slopes of Devil's Peak, the University of Cape Town (UCT) is South Africa's oldest university, and one of Africa's leading teaching and research institutions.

UCT Upper Campus

UCT's new lecture theatre, located at the southern end of University Avenue, is the largest venue on campus and the first to earn a four-star green rating.


UCT New Lecture Theatre outdoor area

UCT New Lecture Theatre amphitheatre

UCT New Lecture Theatre cafe

Locations

Data Carpentry Workshop:

MAC Club, Barnard Fuller Building, UCT Health Sciences Campus

Sports Workshop:

Ulwazi, Knowledge Commons, Chancellor Oppenheimer Library

Packages Workshop:

Hlanganani Junction, Level 7, Chancellor Oppenheimer Library

satRday Conference:

New Lecture Theatre, UCT Upper Campus (South side)


PARKING (Workshop attendees only)

If you are not a UCT affiliate with a parking disc, you will need to get to UCT Upper Campus early (8am), and stop at the visitor’s centre (see above map) to get a visitor’s parking slip. Once received, you can drive the loop along the rugby field and up Madiba Circle, to find a visitor’s parking bay. The above map also shows where the library is situated and the red line indicates the direction you should enter from.

LIBRARY VENUE DIRECTIONS (Workshop attendees only)

Both workshop venues are situated in the UCT Main Library (Chancellor Oppenheimer Library). Entrance to the Library is through the first building directly to the right of Jameson Hall (if you are facing the Hall) at the top of the Jammie stairs (located at the centre of Upper Campus).

If you are not a UCT affiliate, please tell security at the entrance to the library that you are attending an R workshop, and show them identification (ID/driver's licence).

Once you enter the turnstiles, Ulwazi is directly to your right (straight across the Vincent Kolbe Knowledge Commons).

Hlanganani Junction is two floors directly above the entrance. As you enter the turnstiles take the stairs to your left and continue ascending three small flights until you notice the purple sign overhead, indicating Hlanganani Junction to your right.

Travel

Located in the heart of the Southern Suburbs, UCT is within easy reach of transport and accommodation.

Registration

  • Standard price for Conference and Workshop passes will be R500 and R1500 respectively.
  • No late or at-the-door registration is available.

    Workshop Pass

    A workshop pass gives you access to one of the workshops held by our Keynote Speakers. These full day workshops will be run in parallel on the 16th March 2018.

    Workshop Pass
    1 Day Workshop
    Lunch
    Networking breaks with refreshments provided
    Free 1 month trial for DataCamp

    Conference Pass

    The Conference Pass gives you access to South Africa's second satRday. Come and join us on the 17th March 2018 to meet with and hear from both local and international R enthusiasts!

    Conference Pass
    Morning Tutorials
    All Conference Talks
    Lunch
    Networking breaks with refreshments provided
    Free 1 month trial for DataCamp
  • Important Dates

    Working up to the conference on 17 March 2018, these are the most important dates on your calendar:

    Event Date
    Data Carpentry Workshop 2018-03-15/16
    satRday Workshops 2018-03-16
    Conference 2018-03-17

    Call for Papers

    The Call for Papers is now closed. We received 39 submissions and we're currently putting together the technical programme.

    What are the benefits of giving a presentation?

    • Speakers will have their registration fees refunded.
    • Financial support may be possible for speakers.
    • Prompt feedback on your proposed presentation.

    Sponsors

    We thank our generous sponsors for supporting Cape Town satRday. Your financial help and commitment to the R community are greatly appreciated and essential to making this event possible!