February 2018 – muellis blog

As in the last ten (or so) years I attended FODSEM, the biggest European Free Software event. This year, though, I went a day earlier to attend one of the fringe events, the CHAOSSCon.

I didn’t take notice of the LinuxFoundation announcing CHAOSS, an attempt to bundle various efforts regarding measuring and creating metrics of Open Source projects. The CHAOSS community is thus a bunch of formerly separate projects now having one umbrella.

OpenStack’s Ildiko Vancsa opened the conference by saying that metrics is what drives our understanding of communities and that we’re all interested in numbers. That helps us to understand how projects work and make a more educated guess how healthy a project currently is, and, more importantly, what needs to be done in order to make it more sustainable. She also said that two communities within the CHAOSS project exist: The Metrics and the Software team. The metrics care about what information should be extracted and how that can be presented in an informational manner. The Software team implements the extraction parts and makes the analytics. She pointed the audience to the Wiki which hosts more information.

Georg Link from the metrics team then continued saying that health cannot universally be determined as every project is different and needs a different perspective. The metrics team does not work at answering the health question for each and every project, but rather enables such conclusions to be drawn by providing the necessary infrastructure. They want to provide facts, not opinions.

Jesus from Bitergia and Harish from Red Hat were talking on behalf of the technical team. Their idea is to build a platform to understand how software is developed. The core projects are prospector, cregit, ghdata, and grimoire, they said.

I think that we in the GNOME community can use data to make more informed decisions. For example, right now we’re fading out our Bugzilla instance and we don’t really have any way to measure how successful we are. In fact, we don’t even know what it would mean to be successful. But by looking at data we might get a better feeling of what we are interested in and what metric we need to refine to express better what we want to know. Then we can evaluate measures by looking at the development of the metrics over time. Spontaneously, I can think of these relatively simple questions: How much review do our patches get? How many stale wiki links do we have? How soon are security issues being dealt with? Do people contribute to the wiki, documentation, or translations before creating code? Where do people contribute when coding stalls?

Bitergia’s Daniel reported on Diversity and Inclusion in CHAOSS and he said he is building a bridge between the metrics and the software team. He tried to produce data of how many women were contributing what. Especially, whether they would do any technical work. Questions they want to answer include whether minorities take more time to contribute or what impact programs like the GNOME Outreach Program for Women have. They do need to code up the relevant metrics but intend to be ready for the next OpenStack Gender diversity report.

Bitergia’s CEO talked about the state of the GriomoireLab suite.
It’s software development analysis toolkit written largely in Python, ElasticSearch, and Kibana. One year ago it was still complicated to run the stack, he said. Now it’s easy and organisations like the Document Foundation run run a public instance. Also because they want to be as transparent as possible, he said.

Yousef from Mozilla’s Open Innovation team then showed how they make use of Grimoire to investigate the state of their community. They ingest data from Github, Bugzilla, newsgroup, meetups, discourse, IRC, stackoverflow, their wiki, rust creates, and a few other things reaching back as far as 20 years. Quite impressive. One of the graphs he found interesting was one showing commits by time zone. He commented that it was not as diverse as he hope as there were still many US time zones and much fewer Asian ones.

Raymond from the Linux Foundation talked about Metrics in Open Source Communities, what are they measuring and what do they do with the data. Measuring things is not too complicated, he said. But then you actually need to do stuff with it. Certain things are simply hard to measure, he said. As an example he gave the level of user or community support people give. Another interesting aspect he mentioned is that it may be a very good thing when numbers go down, also because projects may follow a hype cycle, too. And if your numbers drop, it’ll eventually get to a more mature phase, he said. He closed with a quote he liked and noted that he’s not necessarily making fun of senior management: Not everything that can be counted counts and not everything that counts can be counted.

Boris then talked about Crossminer, which is a European funded research project. They aim for improving the management of software projects by providing in-context recommendations and analytics. It’s a continuation of the Ossmeter project. He said that such projects usually die after the funding runs out. He said that the Crossminer project wants to be sustainable and survive the post-funding state by building an actual community around the software the project is developing. He presented a rather high level overview of what they are doing and what their software tries to achieve. Essentially, it’s an Eclipse plugin which gives you recommendations. The time was too short for going into the details of how they actually do it, I suppose.

Eleni talked about merging identities. When tapping various data sources, you have to deal with people having different identity domains. You may want to merge the identities belonging to the same person, she said. She gave a few examples of what can go wrong when trying to merge identities. One of them is that some identities do not represent humans but rather bots. Commonly used labels is a problem, she said. She referred to email address prefixes which may very well be the same for different people, think j.wright@apple.com, j.wright@gmail.com, j.wright@amazon.com. They have at least 13 different problems, she said, and the impact of wrongly merging identities can be to either underestimate or overestimate the number of community members. Manual inspection is required, at least so far, she said.

The next two days were then dedicated to FOSDEM which had a Privacy Devroom. There I had a talk on PrivacyScore.org (slides). I had 25 minutes which I was overusing a little bit. I’m not used to these rather short slots. You just warm up talking and then the time is already up. Anyway, we had very interesting discussions afterwards with a few suggestions regarding new tests. For example, someone mentioned that detecting a CDN might be worthwhile given that CloudFlare allegedly terminates 10% of today’s Web traffic.

When sitting with friends we noticed that FOSDEM felt a bit like Christmas for us: Nobody really cares a lot about Christmas itself, but rather about the people coming together to spend time with each other. The younger people are excited about the presents (or the talks, in this case), but it’s just a matter of time for that to change.

It’s been an intense yet refreshing weekend and I’m looking very much forward to coming back next time. For some reason it feels really good to see so many people caring about Free Software.

Month: February 2018

Speaking at FOSDEM 2018 in Brussels, Belgium