SATRDAY
The R community and some of South Africa's most forward-thinking companies have come together to bring satRday back to Cape Town. This conference provides an opportunity to hear from and network with top Researchers, Data Scientists and Developers from the R community in South Africa and beyond.
Some pictures from Cape Town's second satRday in March 2018.
Speakers
Keynote Speakers
Maëlle Salmon
Data Scientist
Maëlle is a data scientist with experience mostly in public health, both in infectious disease and environmental epidemiology. She holds a PhD in statistics from the Ludwig Maximilians University of Munich and has lived in several European countries. She is a very enthusiastic R user and loves sharing her passion, with involvement in the community including blogging, co-founding the R-Ladies Barcelona meetup, package development and being a co-editor at rOpenSci, providing an onboarding system for R package developers.
A blog post that Maëlle wrote about her experience at satRday (Cape Town) 2018.
Stephanie Kovalchik
Data Scientist and researcher
Dr. Stephanie Kovalchik is the lead data scientist in the Game Insight Group at Tennis Australia, the governing body of tennis in Australia, and a Research Fellow in sports
analytics at the Institute of Sport, Exercise and Active Living at Victoria University. Her
research focuses on the use of statistical methods to understand performance, game
strategy, and mentality in high-performance tennis. Stephanie received her PhD in
statistics from UCLA and her Bachelor of Science from Caltech.
Stephanie is an avid R user and the author of several packages, including RISmed (an R interface to NCBI databases) and deuce (a package for tennis analytics). She is also the creator of the tennis analytics blog On the T and regularly writes about tennis there and on Twitter.
A blog post that Stephanie wrote about her experience at satRday (Cape Town) 2018.
Speakers
Neil Rankin
Professor
Stellenbosch University
Peter Kamerman
Physiologist
University of the Witwatersrand
Michael Johnson
BI Architect
SQLSA
David Lubinsky
Managing Director
OPSI Systems
Vishalin Pillay
Data Analyst
Derivco
Schalk Heunis
Senior Data Scientist
Vodacom Big Data Analytics
Marc van Heerden
Advanced Analyst
Bain & Company
Wiebke Toussaint
Data Scientist
UCT Energy Research Centre
Sean Soutar
Student
University of Cape Town
Jasen Mackie
Product/Business Owner
IRESS SA Electronic Execution
Jed Stephens
Student
University of Cape Town
Wasim Lorgat
Student
University of Cape Town
Robert Bennetto
Head of Consulting
Pivot Sciences
Deveshnie Mudaly
Data Analyst
Derivco
Hanjo Odendaal
Data Scientist
Stellenbosch, BER, Eighty20
Andrew Clark
Hobbyist
Freelancer
Alice Elizabeth Coyne
Analyst
Gen Re
Christopher Waspe
Student
University of KwaZulu Natal
Naas van Heerden
CEO and Co-founder
The Profit Table
Neil Watson
Lecturer
Department of Statistical Sciences
Maphale Matlala
Ecosystem Classification scientist
SANBI
Workshops
The satRday Cape Town conference will kick off on the
16th of March 2018 with a day of workshops held by our Keynote
Speakers. Note that these will be full day workshops and will take place in
parallel, so you'll have to choose between them!
R package development
...from laying functions to package hatching!
Maëlle Salmon
This hands-on workshop will allow you to transform a bunch of R functions
into an R package. You'll also learn how to make it usable and used! We'll
assume you can write functions and have a basic familiarity with Git & GitHub.
Here is what we'll learn:
- Why develop a package?
- What is in a package?
- devtools workflow to create a package
- Automatic tools for improving a package, from R CMD check to linting
- Creating a slick documentation website with pkgdown
- Why and how to have your package reviewed?
- How to make your package famous or at least reach your audience?
- Package analytics via CRAN logs and via the gh package.
Attendees should bring a laptop with the latest versions of R and RStudio
installed. Also bring your experience and questions about packages!
The Sport Statistician’s Toolbox in R
Stephanie Kovalchik
The workshop will cover a number of skills and statistical models that
are common in sports statistics and show how each can be implemented in R.
The workshop will introduce participants to a range of R packages and real
sports examples will be used throughout. It will be very hands-on!
After completing this workshop, participants will be able to gather
and clean public sports data more effectively, explore data with
graphics, apply several common models used in sport, and share their sports
statistics ideas through a blog. A brief outline of the workshop is given
below.
Workshop Outline:
- Sports data resources
- Web scraping
- Data exploration and validation
- Sports models and real applications
  - Bradley-Terry paired comparison models.
    Application: Which team is the strongest in the English Football Association League?
  - Pythagorean Theorem.
    Application: How can we predict an NBA team's expected wins for the season?
  - Forecasting.
    Application: What is the chance that a tennis serve will be a service winner?
  - Generalized Additive Models.
    Application: How can we model the strike zone from baseball tracking data?
- Sports Blogging
Further details and instructions will be sent to participants prior to
the workshop.
Data Carpentry Workshop for R novices
This year we will offer a Data Carpentry workshop which will include R
programming for novices.
Data Carpentry workshops teach
introductory computational skills needed for data management and analysis in
all domains of research. For more information on syllabus and requirements
please visit the
workshop webpage.
The Data Carpentry Workshop will take place on the 15th and 16th March
2018.
Note the workshop is sold out.
Programme
Workshops Programme
Start | End | Friday 16 March 2018 |
8:30 | 9:00 | Registration |
9:00 | 10:30 | First Session |
10:30 | 11:00 | Tea / Coffee |
11:00 | 12:30 | Second Session |
12:30 | 13:30 | Lunch |
13:30 | 15:00 | Third Session |
15:00 | 15:30 | Tea / Coffee |
15:30 | 17:00 | Fourth Session |
Conference Programme
Standard talks are 20 minutes and lightning talks are a mere 5 minutes.
Start | End | Saturday 17 March 2018 |
8:00 | 8:30 | Registration |
8:30 | 8:40 | Welcome |
8:40 | 9:30 | Wiebke Toussaint
- Tidy geometries in R (Robert Bennetto)
R provides several packages to help wrangle geometries. Many of these geometry packages precede the tidyverse - meaning they are almost impossible to use correctly with the tidyverse. This talk will walk you through the principles of using `sf` - a spatial package that is tidy compliant.
- Presenting the R package FlowCAr: Flow Network Construction and Analysis in R (Christopher Waspe)
FlowCAr is an R package which allows the user to understand the 'ins and outs' of any flow network. With limited data, a complex flow network can be accurately modeled, visualized and analysed. Using various sources of data, the unknown information in a system can be solved and the system analysed.
- Automated Report Generation with R (Vishalin Pillay)
We're often guilty of working in front of our noses, scripting in the 'here and now' and recycling this habit instead of the code. You don't need a seance to glimpse a little of how future work could be spared by generalizing your analysis templates and injecting some dynamism into the flow.
- Testing the IUCN Red List for Ecosystems methodology on South African datasets (Maphale Matlala)
My project focuses on comparing two of the South African and IUCN RLE criteria that assess spatial symptoms of ecosystem collapse. The redlistr package is used to conduct these assessments, and a non-parametric test is employed to determine whether there is a significant difference between the two methodologies.
|
9:30 | 10:30 | Keynote #1: (Maëlle Salmon)
Katrin Tirok
- Our package reviews in review: introducing and analyzing rOpenSci onboarding system
rOpenSci is a community of researchers and software developers working together to provide better R tooling, mostly packages, for reproducible and open science. Some of these packages are contributed by staff, others by community members. In order to ensure quality of packages in the suite, rOpenSci _onboards_ packages from the community by having them undergo an open review process on GitHub. In this talk, you'll get introduced to this onboarding system. You'll hear about brand-new analyses of the system in R thanks to my [rectangling](https://www.youtube.com/watch?v=GapSskrtUzU) onboarding. You'll also get to learn about our continuous efforts at improving our system even more, in particular via automation. Come hear this talk if you want to know more about software review and about Git(Hub) data analysis!
|
10:30 | 11:00 | Coffee/Tea |
11:00 | 12:30 | Frederik Louw |
12:30 | 13:30 | Lunch |
13:30 | 14:20 | Marie Dussault
- Identifying bias in South African graduate recruitment (Neil Rankin)
There is a lot of evidence that gender and racial biases affect hiring. In South Africa the Employment Equity Act aims to correct this. We use a dataset of graduate applications to show that males and white candidates are favoured by recruiters, and discuss what leap.ly is doing to address this.
- Exact Factors - Improving Haven's Exporting from R to Stata (Jed Stephens)
This talk introduces a suite of functions that extend Wickham's Haven package to provide the user with specific levels for factors - a critical aspect of Stata-ready data. Haven's current exports have arbitrary factor levels, which are unacceptable, especially with survey data.
- Using SQL from R: A short introduction (Deveshnie Mudaly)
The sqldf package can be used to execute SQL statements in R. For those competent in SQL, this package is extremely useful. SQL is a powerful language and, when used in conjunction with R, can enhance and improve the analysis of data. I have made use of SQL to simplify my data frames quickly.
- Modeling game-play momentum in Rugby Union with RShiny (Neil Watson)
How do situational variables like field position, current score, and time remaining influence a team's momentum? We combine data wrangling and visualization with rich historical game data to produce a momentum 'map' that provides decision support in the form of insight into the best actions to take.
|
14:20 | 15:20 | Keynote #2: (Stephanie Kovalchik)
Theoni Photopoulou
- An R Engine for Real-Time Sports Analytics
The Game Insight Group (GIG) at Tennis Australia is aiming to revolutionize statistical thinking in tennis. To meet that goal, we are developing new performance metrics and providing these to players, broadcasters, and fans in real-time at professional tennis events. R is at the core of our analytics tools. In this talk, I will describe how we are using R to design real-time applications for delivering advanced sports statistics to a range of stakeholders. The talk will highlight strategies for developing efficient code, automating processes, and creating customized statistical reports in R markdown.
|
15:20 | 15:50 | Coffee/Tea |
15:50 | 17:00 | Takwanisa Machemedze
- Listening to the news: Big data for consumer sentiment analysis (Hanjo Odendaal)
We are currently researching how to incorporate a new form of information, textual data, into the existing mixed-frequency econometric framework. Can we use alternative secondary data sources to construct a consumer sentiment index that captures perceived economic outlook? I believe so.
- Why build Models as Apps? (Naas van Heerden)
The talk makes the case for building models as dynamic apps. R + package development + Shiny have massive potential as an alternative to traditional static spreadsheet models. Included will be an illustration using a commercially marketed Shiny application that models loan profitability.
- ckan + R: workflows for data retrieval and archiving (Wiebke Toussaint)
Ever wished you could suck that cited dataset directly into your models? ckanr makes it possible. No downloads, no convoluted folders, no lost data. Learn to create workflows that combine the power of CKAN, a leading data portal platform, with R to make data archiving and retrieval a breeze.
- Simulating Trading Strategies for Testing Skill vs Luck or Overfitting (Jasen Mackie)
Simulating the P&L of quant trading strategies can shed light on the dynamics of a strategy, like confidence intervals of max drawdown, but simulating individual trade entries and exits can uncover more insights ultimately discerning skill vs luck or overfitting. Introducing txnsim() in blotter!
- Creating interactive plots in five minutes with datapasta and plotly (Andrew Clark)
Ever seen a table on the web and thought "I wonder what that would look like as a chart" or "I really need to impress my boss in the next 15 minutes"? Here is how to turn a dry set of figures into a compelling visual. Just copy a table, paste it into R, add a few lines of code and voila!
- The R profiler: R's best kept secret (David Lubinsky)
Your R program is running too slowly and you don't know why. The RStudio profiler will tell you exactly where all the time is being spent along with useful visualisations. Profiling is an essential, but often overlooked, tool for every serious R developer's toolkit.
|
17:00 | 17:10 | RStudio Student Lightning Talk Prize |
17:10 | 17:15 | Closing |
17:30 | 19:00 | Cocktails and Conversation |
Panel for Student Lightning Talk competition:
- Alice Coyne,
- Jasen Mackie and
- Marc van Heerden.
R-Ladies Event
When: March 15, 6:00 PM to 8:30 PM.
Where: Bandwidth Barn, Woodstock Exchange, 66-68 Albert Road, Cape Town
Both satRday keynote speakers, Maëlle Salmon and Stephanie Kovalchik, are involved in R-Ladies. This is a worldwide organisation whose mission is to promote gender diversity in the R community, with chapters in cities all over the world.
The R-Ladies Cape Town chapter has invited Maëlle and Stephanie, as well as Marie Dussault from Pretoria, to come and meet their chapter. Maëlle, Stephanie and Marie have offered to give short presentations about creating blogs with R.
The event is primarily aimed at women (including all female-identifying people) but men are welcome provided they do not take a leadership role.
This meet-up will provide a great opportunity for all current and prospective R users to meet and chat with like-minded people. We'd love to see you there, especially if this is your first R-Ladies event. This invite is still extended even if you aren't participating in satRday this year.
If you are interested in attending similar events in the future, please get in contact with us via our Meetup page or our Twitter account.
Code of Conduct
satRday is dedicated to providing a harassment-free and inclusive conference experience for all in attendance regardless of, but not limited to, gender, sexual orientation, disabilities, physical attributes, age, ethnicity, social standing, religion or political affiliation.
We do not tolerate harassment of participants (including organisers and vendors) in any form. Sexual innuendos and imagery are not appropriate for any conference venue, including presentations.
Anyone violating these rules may be given a warning or expelled from the conference (without a refund) at the discretion of the conference organisers.
Our code of conduct/anti-harassment policy can be found here.
Venue
Situated on the slopes of Devil's Peak, the University of Cape Town (UCT) is South Africa's oldest university, and one of Africa's leading teaching and research institutions.
UCT's new lecture theatre, located at the southern end of University Avenue,
is the largest venue on campus and the first to earn a four-star green rating.
Locations
Data Carpentry Workshop:
MAC Club, Barnard Fuller Building, UCT Health Sciences Campus
Sports Workshop:
Ulwazi, Knowledge Commons, Chancellor Oppenheimer Library
Packages Workshop:
Hlanganani Junction, Level 7, Chancellor Oppenheimer Library
satRday Conference:
New Lecture Theatre, UCT Upper Campus (South side)
Attendees of the Friday workshops should take note of the below details regarding parking and directions to the workshop venues in the library.
PARKING (Workshop attendees only)
If you are not a UCT affiliate with a parking disc, you will need to get to UCT Upper Campus early (8am), and stop at the visitor’s centre (see above map) to get a visitor’s parking slip. Once received, you can drive the loop along the rugby field and up Madiba Circle, to find a visitor’s parking bay. The above map also shows where the library is situated and the red line indicates the direction you should enter from.
LIBRARY VENUE DIRECTIONS (Workshop attendees only)
Both workshop venues are situated in the UCT Main Library (Chancellor Oppenheimer Library). Entrance to the Library is through the first building directly to the right of Jameson Hall (if you are facing the Hall) at the top of the Jammie stairs (located at the centre of Upper Campus).
If you are not a UCT affiliate, please tell security at the entrance to the library that you are attending an R workshop, and show them identification (ID/driver's licence).
Once you enter the turnstiles, Ulwazi is directly to your right (straight across the Vincent Kolbe Knowledge Commons).
Hlanganani Junction is two floors directly above the entrance. As you enter the turnstiles take the stairs to your left and continue ascending three small flights until you notice the purple sign overhead, indicating Hlanganani Junction to your right.
Travel
Located in the heart of the Southern Suburbs, UCT is within easy reach of transport and accommodation.
Registration
Standard price for Conference and Workshop passes will be R500 and R1500 respectively.
No late or at-the-door registration will be available.
A ticket to the conference or workshops will also give you one month of free
access to DataCamp!
Workshop Pass
A workshop pass gives you access to one of the
workshops held by our Keynote Speakers. These
full day workshops will be run in parallel on the 16th March
2018.
Workshop Pass |
1 Day Workshop |
Lunch |
Networking breaks with refreshments provided |
Free 1 month trial for DataCamp |
Conference Pass
The Conference Pass gives you access to South Africa's second
satRday. Come and join us on the 17th March 2018
to meet with and hear from both local and international R enthusiasts!
Conference Pass |
Morning Tutorials |
All Conference Talks |
Lunch |
Networking breaks with refreshments provided |
Free 1 month trial for DataCamp |
Important Dates
Working up to the conference on 17 March 2018, these are the most important dates on your calendar:
Event | Date |
Data Carpentry Workshop | 2018-03-15/16 |
satRday Workshops | 2018-03-16 |
Conference | 2018-03-17 |
Call for Papers
The Call for Papers is now closed. We received 39 submissions and we're currently putting together the technical programme.
What are the benefits of giving a presentation?
- Speakers will have their registration fees refunded.
- Financial support may be possible for speakers.
- Prompt feedback on your proposed presentation.