Artificial intelligence (AI) combined with intelligent automation of business processes reduced content moderation time from 24 hours to 30 seconds.
11 March 2019
#cases #ai #automatization
Reading time: 3 min
AI was one of the main technology trends of 2018, and it carried over successfully into 2019. Companies boast about how artificial intelligence helps them improve their work, for example through an intelligent management system.
However, when we talk about an «intelligent management system», it is worth remembering that IT companies were working in this area long before anyone publicly called it «artificial intelligence», and before AI became mainstream.
We have a case from that time which shows that AI, whatever you call it, can be simple and effective. You do not need to be a giant corporation to use this technology, and even the most difficult tasks can be solved in a non-trivial way that benefits not only the customer but also the customer's employees and clients.
So…
A long time ago (in 2013) in a galaxy far far away (in the Omsk region)...
(Or meeting the customer)
In this story, our customer was very afraid of losing a large sum of money, or even the whole business. Do you think he was doing something dangerous? Nothing of the kind. He just ran a free classifieds site.
So what was the real danger? Users submitted ads through a form on the site in various sections: real estate, cars, vacancies, etc. Under Russian advertising law, a classifieds site also bears serious responsibility for the ads placed on it. If a user posts an ad for something illegal, or simply false information, not only the user but also the site owner is punished. The fine for violating this law can range from 200 to 500 thousand rubles. That was too much money for an Omsk portal in 2013, and it still is today.
All ads on the site were checked by moderators. But if a user managed to post illegal content and a moderator had not yet noticed it, the site owner could be punished.
Therefore, in the course of our long-term cooperation, it was decided to modernize the moderation system, because it had problems.
This may not seem like the most important task on the list, but for the client this area carried great risks. It was important to get moderation right, because user posts could bring a huge fine in a variety of ways.
For example, in the «Real estate» section, a realtor could advertise apartments in a new building that had no building permit. If the documents are not in order, both the advertiser and the site itself can be punished for the ad. Even more dangerous are goods whose circulation is regulated by a single document: the Criminal Code of the Russian Federation. Users of the site tried to place ads for such things too.
How did it work?
(More precisely, how it did NOT work)
We studied the moderators' workflow on the site and found the following. Despite the importance of ad moderation, for the employees who handled it this was not their main responsibility at all, but a side job done in their spare time.
At the same time, moderation was paid by the piece: the more ads an employee checked, the more money they received. This did not lead to anything good.
We identified a lot of problems in the moderation system:
Because moderation was not the employees' main job, there was no guarantee that a "fresh" ad would be checked immediately. On average it waited its turn for 24 hours. With such a delay, pre-moderation was out of the question, so all checks took place after publication. That meant that for 24 hours the site could be at risk of a fine or immediate blocking.
Because of the piece-rate pay, moderators focused on checking as many ads as possible in a short time in order to earn more. This is understandable (and normal: people want to make money). As a result, no one wanted to take the most complex and problematic ads, so they sank to the end of the moderation queue and stayed there for a long time.
Users added to the moderation problems as well. By default, ads on the main page were ordered by last modification time, with the most recently changed ads shown first. Users quickly figured out that they could open an old ad and change something trivial, for example add an extra space. The modification time would change and the ad would appear on the first page again, where more visitors would see it. As a result, confused moderators had to check the same ads over and over, not realizing that they had not really changed at all.
Numbers
5 Moderators
50 000+ Ads per week
24 hours Average moderation time for 1 ad
"Heeeelp!"
(Or setting the task)
Here we really wondered what to write: the moderation system was so terrible that it seemed as if the customer was simply lost and calling for help, raising his hands to the sky.
It is simply impossible to work when illegal advertising can stay on the site for 24 hours and the site can be blocked for it. Blocking immediately deprives the site owner of income, and there is no telling how quickly the portal could be brought back online afterwards. And this was not the only problem.
Almighty algorithm
(Or solving the problem)
First of all, we had to reduce the number of ads coming back for re-moderation. So the first solution was simply to add a button that let anyone raise an ad on the page without changing it (and therefore without sending it to re-moderation). The "Raise ad" button became very popular among advertisers.
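To make the difference concrete, here is a minimal sketch of the idea, not the actual code of the project. The `Ad` class, its fields, and both function names are our assumptions for illustration: an ordinary edit bumps the ad and resets its moderation status, while "Raise ad" only touches the timestamp used for ordering.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Ad:
    # Hypothetical model; the real schema is not described in the article.
    text: str
    approved: bool          # result of the last moderation
    sorted_at: datetime     # timestamp used to order the listing page


def edit_text(ad: Ad, new_text: str, now: datetime) -> None:
    """Old behaviour: any edit bumps the ad and forces a re-check."""
    ad.text = new_text
    ad.sorted_at = now
    ad.approved = False     # the ad goes back to the moderators


def raise_ad(ad: Ad, now: datetime) -> None:
    """'Raise ad': bump the ad in the listing, content untouched."""
    ad.sorted_at = now      # moderation status is deliberately left alone


if __name__ == "__main__":
    ad = Ad("Sofa for sale", True, datetime(2013, 1, 1, tzinfo=timezone.utc))
    raise_ad(ad, datetime(2013, 1, 2, tzinfo=timezone.utc))
    print(ad.approved)      # still approved, no re-moderation needed
```

With a button like this, users no longer have a reason to fake an edit, so the moderators stop seeing the same unchanged ads again.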
Secondly, we decided that moderators should not choose which ads to check, so we created an algorithm that made the decision for them. Today it could be called "artificial intelligence", but we prefer to call it an algorithm or simply a "queue".
The automatic "queue" determined which moderator each ad was sent to. Time limits were set, and if a moderator could not check an ad within the allotted period, it was returned to the front of the "queue". The system made sure that the average moderation time stayed as low as possible. If a moderator believed they could finish the job but needed more time, they had to confirm that they were still working on it.
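The behaviour of such a "queue" can be sketched roughly as follows. This is an illustration under our own assumptions, not the production algorithm: the class name `ModerationQueue`, its methods, and the use of plain numbers of seconds for time are all hypothetical.

```python
from collections import deque
from typing import Optional


class ModerationQueue:
    """Hands out ads in order and takes them back when a deadline passes."""

    def __init__(self, time_limit: float):
        self.time_limit = time_limit    # seconds allowed per ad (assumed unit)
        self.pending = deque()          # ad ids waiting, oldest first
        self.in_progress = {}           # ad_id -> (moderator, deadline)

    def submit(self, ad_id: str) -> None:
        self.pending.append(ad_id)

    def assign(self, moderator: str, now: float) -> Optional[str]:
        """Give the moderator the next ad; they do not get to choose."""
        self.reclaim_expired(now)
        if not self.pending:
            return None
        ad_id = self.pending.popleft()
        self.in_progress[ad_id] = (moderator, now + self.time_limit)
        return ad_id

    def extend(self, ad_id: str, now: float) -> None:
        """The moderator confirms they are still working on the ad."""
        moderator, _ = self.in_progress[ad_id]
        self.in_progress[ad_id] = (moderator, now + self.time_limit)

    def complete(self, ad_id: str) -> None:
        del self.in_progress[ad_id]

    def reclaim_expired(self, now: float) -> None:
        """Return overdue ads to the front of the queue."""
        expired = [ad for ad, (_, deadline) in self.in_progress.items()
                   if deadline <= now]
        # Reverse so the ad that was assigned first ends up first again.
        for ad_id in reversed(expired):
            del self.in_progress[ad_id]
            self.pending.appendleft(ad_id)
```

The key design point is that an overdue ad goes back to the front, not the back, so a difficult ad cannot sink to the end of the queue the way it did under piece-rate moderation.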
Thirdly, we found that ad authors performed certain actions more often than others. These can be called a set of typical edits.
One of the most common edits is a price change with no change to the ad text. Such an edit cannot affect compliance with the advertising law, so we let users make it without re-moderation. We did the same for a number of other attributes that logically did not require moderation.
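A rule like this fits in a few lines. The sketch below is only an illustration: the field names and the contents of `NO_REVIEW_FIELDS` are our assumptions, since the article names only the price as an example.

```python
# Hypothetical set of fields that may change without a new check.
# Only "price" is mentioned in the text; a real list would be longer.
NO_REVIEW_FIELDS = {"price"}


def needs_remoderation(old: dict, new: dict) -> bool:
    """Return True if any field outside the safe list has changed."""
    changed = {key for key in old.keys() | new.keys()
               if old.get(key) != new.get(key)}
    return bool(changed - NO_REVIEW_FIELDS)


if __name__ == "__main__":
    old = {"title": "Bike", "text": "Almost new", "price": 5000}
    print(needs_remoderation(old, {**old, "price": 4500}))      # False
    print(needs_remoderation(old, {**old, "text": "New text"}))  # True
```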
Fourth, we added a system that parses the text and highlights certain information in it. We taught the system to analyze the ads published by users. On the one hand, it picks out useful information, for example addresses and other details. On the other hand, it checks for keywords that indicate an illegal ad.
If the system could recognize the ad and found nothing suspicious, the ad was not sent for moderation but published automatically. If something suspicious was found, or the ad was not recognized, it was sent to a moderator. Unrecognized ads were used for further training of our AI.
This is where elements of artificial intelligence and machine learning come in. As a result, our AI successfully caught attempts to deceive the system in new ads, for example when the title contained permitted information but the body of the ad included something illegal.
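The routing decision itself can be shown with a deliberately simplified, rule-based stand-in. The real system relied on trained models; here the word list, the address pattern, and the function name `route_ad` are all invented for illustration.

```python
import re

# Hypothetical stop-list; the real keywords are not given in the article.
SUSPICIOUS_WORDS = {"weapon", "counterfeit"}
# Very rough address pattern, used only to illustrate fact extraction.
ADDRESS_RE = re.compile(r"\b\d+\s+\w+\s+(?:street|avenue)\b", re.IGNORECASE)


def route_ad(title: str, body: str) -> str:
    """Return 'publish' or 'moderator' for a new ad."""
    # Check title and body together, so a clean title cannot hide the body.
    words = set(re.findall(r"\w+", f"{title} {body}".lower()))
    if words & SUSPICIOUS_WORDS:
        return "moderator"      # something suspicious was found
    if not ADDRESS_RE.search(body):
        return "moderator"      # the ad could not be recognized
    return "publish"            # recognized and clean: publish automatically


if __name__ == "__main__":
    print(route_ad("Flat for rent", "Two rooms at 12 Lenina street"))
```

Anything the rules cannot confidently classify falls through to a human, which matches the behaviour described above: automation handles the clear cases and moderators see only the doubtful ones.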
Fifth, we deliberately did not teach our AI to recognize photos. As a result, the moderator's main job became simply viewing the photos.
Why did we not dare to train the system to recognize pictures? Because users were so inventive with photos, sneaking goods prohibited for sale into them, that even human intelligence had a hard time spotting them.
Sixth, when an ad came back for a re-check, we used highlighting to show moderators the photos and text that had changed. Moderators no longer needed to review the entire ad; it was enough to look at the highlighted fragments.
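For the text part, this kind of highlighting can be approximated with Python's standard `difflib` module. This is a sketch of the general technique, not the project's code: the function name and the `[[...]]` markers are our assumptions, and photo comparison is left out.

```python
import difflib


def highlight_changes(old: str, new: str) -> str:
    """Wrap words that were added or replaced in [[...]] markers."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        chunk = new_words[j1:j2]
        if not chunk:
            continue    # pure deletion: nothing to show in the new text
        text = " ".join(chunk)
        out.append(text if tag == "equal" else f"[[{text}]]")
    return " ".join(out)


if __name__ == "__main__":
    print(highlight_changes("Selling a red bike", "Selling a blue bike"))
    # prints: Selling a [[blue]] bike
```

Comparing word by word instead of character by character keeps the highlighted fragments readable for a moderator who is scanning the ad quickly.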
Seventh, all functions were now implemented as a web application. Previously, the system existed only as a desktop application that moderators had to install on their computers. Now that it was on the web, an employee could check ads anytime, anywhere.
Moderation in 30 seconds
(Or the results of our work)
As a result, the moderation time for one ad dropped by a factor of 2,880: from 24 hours to 30 seconds. At that speed, of course, ads could go through pre-moderation, which minimized the risk of users ever seeing a "dangerous" ad.
The number of moderators decreased by 80%. In other words, after the modernization one moderator could do all the work that previously took 5 employees. The reason is simple: we automated most of the work.
In this case, we were able to make extensive use of natural language processing (NLP), a field shared by artificial intelligence and mathematical linguistics.
For artificial intelligence, analysis means understanding the language; in particular, we solved the problem of extracting facts from text. Initially the system produced many inaccuracies and hit many exceptions, so we gave it an additional training process.
This case also ties into the discussion of why AI cannot replace a person: a full understanding of what users are doing and what task they are trying to solve is unlikely to be available to artificial intelligence.
In addition, while solving these problems we gave users a pleasant feature: the "Raise ad" button. Today this is a very common solution, but five or six years ago things were different. So, while improving the company's internal workflow, we were also able to benefit its customers.
Summary
It may seem that in this case our solutions were aimed only "inside" the company, at improving its working processes and reducing the risk of the site being fined for carrying unlawful advertising.
But in the end everyone benefited: the site put things in order, the moderators' work became more comfortable, the company reduced its risks, and the users got, first, a pleasant feature in the form of the "Raise ad" button and, second, fast moderation of their ads.