The ODRS is the service that produces ratings and reviews for gnome-software. I built the service a few years ago, and it’s been dutifully trucking on ever since. There are over 25,000 reviews, 50k votes, and over 4k different applications reviewed. Over half a million clients get application reviews every single day.
Recently it’s been showing signs of needing work, and so I’ve spent a few days converting it to Python 3, then to SQLAlchemy, and then fixing all the broken stuff that we’ve lived with for a while (e.g. no emoji support because we were not using utf8mb4…). Part of the new work will be making it easier to flag and then moderate reviews, and that needs your help. Although any unauthenticated user can report a review for any reason, some reviews should be automatically marked at submission if they contain known bad words. There is almost no reason to write a review in locale en_GB and use the word fuck, and so I think marking that review as needing moderation before it’s shown to thousands of people is a sensible thing to do.
For this to work, I can’t just use a single blacklist of words, as some words are only really vulgar in some regions, and some are perfectly valid words in other languages. For this reason I need the blacklist to be keyed to the submitted locale.
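To make the idea concrete, here is a minimal sketch of a locale-keyed lookup. It is not the actual ODRS code: the `BLACKLIST` table, its entries, and the `needs_moderation` helper are all hypothetical names made up for illustration, and the real word lists would come from the spreadsheet.

```python
import re

# Hypothetical data: maps a submitted locale to the words that should
# trigger moderation for reviews written in that locale.
BLACKLIST = {
    "en_GB": {"fuck"},
    "de_DE": {"scheiße"},
}

# \w+ is Unicode-aware in Python 3, so non-ASCII words are kept whole.
_WORD_RE = re.compile(r"\w+")


def needs_moderation(text, locale):
    """Return True if the review contains a word blacklisted for its locale."""
    # Unknown locales have no list, so nothing is flagged for them.
    bad_words = {w.casefold() for w in BLACKLIST.get(locale, set())}
    tokens = {t.casefold() for t in _WORD_RE.findall(text)}
    return not tokens.isdisjoint(bad_words)
```

This only matches whole words exactly, so inflected forms have to be listed separately, and a word is only flagged for the locales that actually list it.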
This is where I need your help. If you can spare 2 minutes and know a lot of dirty words in your language, can you please add them to this spreadsheet? Much appreciated.
You have no idea how long I waited for an opportunity to shine with my knowledge of swear words. :-D
Joking aside, regarding the wording: is the list used directly as a blacklist? If so, would it be good to add conjugated forms of some verbs?
Like “Ich finde dieses Programm absolut beschissen!” (word for word: “I find this software absolutely shitty.”) So adding “Scheiße” won’t work. I would also have to add “beschissen”.
¯\_(ツ)_/¯ — do your best dude.
Not just verbs, other words as well, such as plural and definite forms, which are more complicated in Swedish than in English. Adjectives are also inflected to agree with the noun they’re describing.
¯\_(ツ)_/¯ — do your best dude. This is really just a first line of defense. Users can report offensive reviews still.
And of course, people will use some English words when writing in Swedish.
Do you think that’s true for all languages?
Please note that many non-English speakers (e.g. me) have a tendency to submit reviews in English, even though they’re running a non-English locale desktop. So instead of judging from their desktop locale, you should allow the user to specify which language the feedback is in (Steam does this well). That also helps further down the road in showing the right feedback to the right audience.
Is there a GNOME Software issue open about this? It would be easy to fix, but would need design input from Allan.
That could become interesting when people are rating the new client application for PornHub ;)
As I read through these words, I’ve seen several among them which are highly context dependent and cannot be blocked in general. A word like “nazi” is not acceptable to use as a swear word towards parents or friends, but if there is an app that propagates ultra-fascist content, it would be. I think it is hard to encapsulate this in a spreadsheet where everyone wants to shine with their knowledge of swear words.
Also, while “nazi” might be inappropriate in certain contexts (though I don’t consider it a swear word myself), terms like “grammar nazi” might be appropriate.