How did you learn of the Data Umbrella scikit-learn sprints and what inspired you to attend?
I learned of the first Data Umbrella scikit-learn online sprint, which took place in June 2020, via Twitter. I was interested in contributing to open source and had already made one contribution to scikit-learn. However, when I started contributing to open source I didn’t have a network of like-minded people. I was very much looking forward to connecting with people who shared my interest in open source, data science, and scikit-learn, and to building a professional network in this field.
What was your experience at these sprints? What was your experience prior to that in contributing to open source?
scikit-learn is the first open source project I contributed to. As mentioned above, before my first Data Umbrella scikit-learn sprint I didn’t have a network. I found it difficult to find easy issues that would give me a way into the project, as many of them were taken very quickly. Together with the scikit-learn core developers, the Data Umbrella team curated a list of beginner-friendly issues which were reserved for the sprint. This list of issues, combined with the preparation material provided for the sprint, helped me a lot in becoming an active member of the scikit-learn community.
My experience at the sprints has always been very positive. I was very happy to learn that the scikit-learn community is open, friendly, and welcoming to beginners and people from underrepresented groups in tech. I never felt afraid to ask questions and I always felt supported by the mentors.
Apart from feeling supported and welcomed, I also really enjoyed learning about my pair programming partners’ backgrounds and the challenges they face in their countries and continents. Living in Europe, I’m not very exposed to perspectives from people working in tech in Africa or Latin America, so their insights were very valuable to me and broadened my horizons.
What were the benefits of participating in these sprints?
A very important aspect to me is that through the Data Umbrella scikit-learn sprints I built a good professional network that I can trust. This makes me feel more welcomed and accepted in the tech world.
Another important aspect is that through the sprints I became more confident. Being supported and knowing that the sprints are a safe space helped me develop my skills and knowledge. From the second online sprint onwards I participated as an assistant mentor, helping my pair programming partners submit pull requests. This not only improved my teaching skills; it also gave me clear evidence that my abilities were advancing, which pushed back my imposter syndrome.
Building on my newly gained confidence, I started organising monthly open source workshops as part of PyLadies Berlin. The aim of these workshops is to help more people from underrepresented groups in tech contribute to open source projects, thereby strengthening their professional portfolios and advancing their careers.
Last but not least, the sprints helped me stand out in job interviews. Contributions to open source projects, especially to well-known and established libraries such as scikit-learn, help set you apart from the pool of job applicants. My contributions to the scikit-learn library helped me get job interviews and make a great impression on potential employers.
What, in particular, makes Data Umbrella and the sprints welcoming and accessible for you?
Prior to the sprints, comprehensive preparation material was sent out with specific and clear instructions. For me, this preparation material made contributing to open source a lot easier and I still refer to it on a regular basis.
I would also like to highlight that the sprints were very well organised and that this also made a big difference. Organising a sprint includes, for example, sending out preparation material on time; sending out reminders about the date and time; providing a pre-sprint preparation session where people can ask questions; setting up a Discord server with different rooms and tables; getting the core developers on board to collect issues of different skill levels that participants can work on, and to mentor and review pull requests; providing a post-sprint follow-up session for further questions and for getting pull requests reviewed by the core developers; having a code of conduct and enforcing it if necessary; and encouraging participants to stay active in the community and to keep contributing. All of these tasks require a fair amount of work and preparation, but they ensure that participants are able to contribute successfully to the project, feel welcome and supported, and leave the sprint with confidence and a sense of accomplishment.
What specific takeaways from the Data Umbrella sprints did you incorporate into your events?
I noticed that many people from underrepresented groups in tech are very eager and motivated to participate in sprints but struggle to continue contributing to open source projects when they’re on their own. Based on this, I realised that having repeated events is important for keeping people engaged and for building a community. Through PyLadies Berlin I therefore set up monthly open source hack nights so that participants can ask questions if they’re stuck, connect with like-minded people who face the same challenges, and build a professional network.
Another takeaway for me was that the repeated Data Umbrella scikit-learn sprints helped me develop my coding skills and, building on that, made me feel more confident. Every time I volunteered as an assistant mentor I saw how much my experience and knowledge had improved compared to the previous sprint, and being able to help participants made me feel confident in my skills. I hope that by attending repeated open source hack nights the participants will also 1) advance in their skills and knowledge, 2) become more confident over time, and 3) be able to contribute to open source independently.
A last and important takeaway is that creating a welcoming space makes people want to come back and helps them thrive personally and professionally. I know many people from underrepresented groups in tech who are unhappy in their jobs because the culture at their company is not welcoming for everyone. Creating a space where everyone is accepted and respected and where it’s safe to ask questions keeps people engaged, and I’m repeatedly impressed by the achievements of sprint participants, especially when they don’t have a formal computer science education.