Dear students of the International Winter School on Software Engineering,
I am looking forward to seeing you all on Monday Feb. 14th for our session on “Global Software Engineering in Open Source and its Effect on Quality”.
To get the most out of this session, I would like you to prepare a bit.
Please read the paper by Shihab et al., “The Effect of Branching Strategies on Software Quality”, before Monday Feb. 14th.
We will start the session with a discussion of this paper. After a brief lecture on relevant topics, you will work collaboratively in groups of three on a mini research project inspired by the paper. That project will include scripted collection and processing of data. We will conclude with interpretation and discussion of our results.
Please have your computers ready with your favorite scripting language installed and set up. I will provide some reusable code examples (Python) in a virtual environment. That is, in case you do not have a favorite scripting language yet, you might want to choose Python ;)
For our mini research project, we will collect data from Git repositories and Jira issue trackers. You might want to familiarize yourself with:
- How to collect information from a Git repository.
  - You might find online resources on how to parse output from the git log command for analysis. Some examples are given in these lecture notes, which can be executed in this virtual environment.
  - Alternatively, you might want to check PyDriller, a tool that facilitates analysis of Git repositories.
- How to send HTTP requests and process their responses, e.g., via the requests package.
- How to handle timestamps that are provided as strings, e.g., via the datetime module and its functions fromisoformat or strptime. For the latter, the corresponding time format codes are important.
- How to plot scatter plots, e.g., with matplotlib’s scatter function.
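To give you a rough idea of the first and third points, here is a minimal sketch using only the Python standard library. It is not the code from the lecture notes. The log format string, the function names, and the sample timestamp format (fractional seconds followed by an offset such as +0000, which is what I assume a Jira server returns) are illustrative assumptions, so please check them against the actual data you collect.

```python
import subprocess
from datetime import datetime

# Assumed output format: commit hash, author name, and author date in
# strict ISO 8601, separated by "|". Any other separator would work too.
LOG_FORMAT = "%H|%an|%aI"

# Assumed shape of an issue tracker timestamp, e.g. 2022-02-10T08:00:00.000+0000
TRACKER_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def read_git_log(repo_path):
    """Run git log in the given repository and return its raw text output."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--pretty=format:{LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_iso(timestamp):
    """Parse an ISO 8601 string with datetime.fromisoformat."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    if timestamp.endswith("Z"):
        timestamp = timestamp[:-1] + "+00:00"
    return datetime.fromisoformat(timestamp)


def parse_tracker_timestamp(timestamp):
    """Parse a timestamp with an explicit format code string via strptime."""
    return datetime.strptime(timestamp, TRACKER_FORMAT)


def parse_git_log(text):
    """Turn the text produced by read_git_log into a list of dictionaries."""
    commits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split the hash off the front and the date off the back, so that
        # an author name containing "|" does not break the parsing.
        commit_hash, rest = line.split("|", 1)
        author, date = rest.rsplit("|", 1)
        commits.append(
            {"hash": commit_hash, "author": author, "date": parse_iso(date)}
        )
    return commits
```

Calling parse_git_log(read_git_log("path/to/some/clone")) would then give you one dictionary per commit. Because both parsers return timezone-aware datetime objects, you can subtract a commit date from an issue date directly and plot the resulting durations.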
Best regards, Helge