In the last post of this tutorial series, we created a simple page where a user can leave a comment. In our scenario, however, all new comments have to be reviewed before they are published. Instead of moderating these comments ourselves, we will let the Microworkers.com workforce do this for us. As different workers may have different opinions on whether to approve or decline a comment, we will use a majority vote system: every comment will be reviewed by 3 workers, and once all reviews are in, our application will decide whether the comment can be published. In this post we will show you how to group all new comments into one campaign, send it to the Microworkers.com Beta API and finally start it.
You can try out the finished Comment Moderation System at commentmoderation.demo.microworkers.com. You can also download the end result of this part at commentmoderation.demo.microworkers.com/Downloads/Step2.zip. We recommend that you download this zip file first if you want to follow along, as we will only mention the most important steps in this tutorial.
To use the Microworkers API we need a suitable RESTClient as well as an API Key for authentication. You can find your API Key at the bottom of “My Data” -> “Account” in the backend of the beta system. Luckily there is also a ready-made RESTClient, which we will simply download and use as a library in our project. Complete documentation of the API can be found here.
The workflow for creating a campaign out of all new comments can be described as follows:
- Fetch all newly created comments by their status (0) from the database
- Create a CSV file
- For every comment and for every review request, create an entry in the CSV file with the comment in the first column and the comment ID in the second column. So if you have 3 new comments and want 5 reviews before deciding whether a comment can be approved, you will end up with 3×5=15 entries in the CSV file. The second column holding the comment ID will be used later to map all reviews to the specific comment.
- Update the status of these comments to 1 (pending review) so they will not be picked up again by the campaign creation process
- Create a Dynamic Data Campaign passing along the CSV file.
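The row expansion described in the list above can be sketched as follows. This is only an illustration under stated assumptions: the function name `buildReviewRows` and the array keys `id` and `text` are hypothetical and not taken from the tutorial's actual script.

```php
<?php
// Sketch only: function and key names are illustrative assumptions.

/**
 * Expand each new comment into one CSV row per requested review.
 *
 * @param array $comments     List of ['id' => int, 'text' => string] records.
 * @param int   $numOfReviews How many workers should review each comment.
 * @return array              Rows of [comment text, comment ID].
 */
function buildReviewRows(array $comments, int $numOfReviews): array
{
    $rows = [];
    foreach ($comments as $comment) {
        for ($i = 0; $i < $numOfReviews; $i++) {
            // Column 1: text shown to the worker.
            // Column 2: ID used later to map reviews back to the comment.
            $rows[] = [$comment['text'], $comment['id']];
        }
    }
    return $rows;
}

$comments = [
    ['id' => 1, 'text' => 'Great post!'],
    ['id' => 2, 'text' => 'Buy cheap watches'],
    ['id' => 3, 'text' => 'Thanks for sharing.'],
];
$rows = buildReviewRows($comments, 5);
echo count($rows), " rows\n"; // 3 comments x 5 reviews = 15 rows
```

Keeping the expansion separate from file and database access makes it easy to check the row count before anything is sent to the API.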
First of all we will update our project structure as follows:
The newly created “CSVFiles” folder will store all the CSV files our Comment Moderation System creates over time. Furthermore, we created a folder called “RESTClient” which holds the provided RESTClient library. The script createCampaign.php does all the work and has to be called at regular intervals so that new comments are picked up continuously.
Now let’s extend our settings.php file with the additional variables we will need. Most importantly, we define a variable holding the API Key and a variable for the number of reviews we want before taking further action (lines 25-29). In addition, we set the category and subcategory ID of our campaign. The API provides calls to fetch these IDs on the fly, but as we do not want to query the API more than we have to, we look up these IDs in advance and set the variables manually (lines 31 to 35). We also define the payment per task. Finally, we define the URLs of the API’s function points for convenience (lines 41 to 47).
createCampaign.php:
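As a rough idea of what such a settings block might look like, here is a minimal sketch. Apart from `numOfReviews`, which the tutorial names, every variable name, value and URL below is a placeholder chosen for illustration; the real IDs, payment and endpoint URLs have to be taken from your account and the API documentation.

```php
<?php
// settings.php (excerpt) - illustrative sketch, all values are placeholders.

// Authentication: the key from "My Data" -> "Account".
$apiKey = 'YOUR_API_KEY';

// Number of reviews to collect per comment before a decision is made.
$numOfReviews = 3;

// Campaign classification, looked up once in advance instead of
// querying the API on every run. Replace 0 with the real IDs.
$categoryId    = 0;
$subCategoryId = 0;

// Payment per task (placeholder amount).
$paymentPerTask = 0.10;

// Endpoint URLs. Both the host and the path are hypothetical stand-ins.
$apiBaseUrl         = 'https://api.example.invalid';
$dynamicCampaignUrl = $apiBaseUrl . '/dynamic-campaigns';
```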
As usual we start off by including our settings file. In addition we include the RESTClient library (line 7). Then we establish a database connection as seen in Part I of this tutorial series. As a security precaution we only allow localhost to run this script (line 20).

Next we check whether there are any new comments in the system by counting all comments with the appropriate status (lines 24 to 26). If there are new comments, we build a file name for the CSV file which includes the timestamp of its creation, so that each file gets a unique name, and create the file in our “CSVFiles” folder (lines 30 to 36).

Now we loop over all comments fetched from the database (line 38). Inside this loop we run a second loop that is executed as many times as the “numOfReviews” variable in settings.php specifies (line 40). Each pass of the inner loop creates a single row containing the comment in the first column and the comment ID in the second, and writes that row to the CSV file (lines 41 to 44). Finally, we update the status of the comment to pending review so it will not be picked up again the next time the script runs (lines 46 to 47).
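The file-writing step can be sketched with PHP's built-in `fputcsv`. This is not the tutorial's actual code: the helper name `writeReviewCsv`, the `comments_` file prefix and the use of the system temp directory are assumptions made so the example runs on its own.

```php
<?php
// Sketch only: helper name, file prefix and directory are assumptions.

/**
 * Write review rows to a timestamped CSV file and return its path.
 */
function writeReviewCsv(string $dir, array $rows): string
{
    if (!is_dir($dir) && !mkdir($dir, 0755, true)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }
    // The timestamp keeps names distinct as long as the script
    // runs at most once per second.
    $path = rtrim($dir, '/') . '/comments_' . time() . '.csv';

    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file: $path");
    }
    foreach ($rows as $row) {
        // fputcsv quotes commas, double quotes and line breaks in the
        // comment text; the delimiter, enclosure and escape arguments
        // are passed explicitly with their traditional default values.
        fputcsv($handle, $row, ',', '"', '\\');
    }
    fclose($handle);
    return $path;
}

$path = writeReviewCsv(sys_get_temp_dir() . '/CSVFiles', [
    ['Nice, "really" nice', 7],
    ['Nice, "really" nice', 7],
]);
echo "Wrote $path\n";
```

Letting `fputcsv` do the quoting matters here because user comments can contain commas and quotes that would otherwise shift the comment ID out of the second column.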
With the CSV file fully populated we can start initializing the RESTClient. For this we create a new instance and set our API Key, the URL of the function point we want to call and the HTTP method to use. Following REST conventions, new objects are created by sending a “POST” to the object’s root path. Thus we set the method to “POST” and the URL to the root path of the Dynamic Data Campaign (lines 55 to 59).
In lines 60 to 88 we create the campaign’s settings using the format described in the API Documentation. In the detailedDescription field we use the “$mw_csvColumn1” tag, which will be replaced with the content of the first column of our CSV file when the task description is displayed to the worker. As we cannot predict how long the comment is, we set the time needed to complete the task to a generous 5 minutes. When choosing this time constraint, please take into account that the worker needs time to get familiar with the task and that internet connections may be far slower in other countries than in your own. This is an upper limit, so a faster worker can submit before reaching it and the campaign is not slowed down.

As we only allow one worker per task we set the tracking method “workerTrackIPBased” to 0. This means that the system will track task submissions on an account basis instead of using the IP address. We continue by setting the proofs we ask from the worker. In our scenario one proof is sufficient, in which the worker should enter “y” to approve or “n” to decline the comment. We also make use of a new feature, the so-called “Proof Validations”. In our case we use a regular expression which only matches “n” or “y”. If the worker provides a different format, the submission will be flagged accordingly and we can make use of this flag later on in the rating process.

Finally we set the CSV file. Here we have to prefix the path with the “@” symbol and use the absolute path to the file so that the RESTClient can process the request properly.
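To make the proof validation concrete, here is a small local sketch of a pattern that accepts only “y” or “n”. The exact expression used in the campaign settings is not shown in this post, so both the pattern and the helper name `isValidProof` are assumptions.

```php
<?php
// Sketch only: the pattern and helper name are assumptions.

/**
 * Return true if the proof is exactly "y" or "n".
 */
function isValidProof(string $proof): bool
{
    // ^ and $ anchor the match to the whole string. The D modifier
    // stops "$" from also matching before a trailing newline.
    return preg_match('/^[yn]$/D', $proof) === 1;
}

var_dump(isValidProof('y'));   // bool(true)
var_dump(isValidProof('n'));   // bool(true)
var_dump(isValidProof('yes')); // bool(false)
var_dump(isValidProof('Y'));   // bool(false)
var_dump(isValidProof("y\n")); // bool(false)
```

Running the same check on your side when the results come back gives a second line of defence in case a flagged submission slips through.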
Now that the campaign settings are complete we execute the call to the API (line 100). As a response we receive a JSON object with information about the success of the call and, if the campaign has been created successfully, its “campaign_id”. We finish our script by starting the campaign automatically: we take the “campaign_id” from the last response and configure the RESTClient accordingly. As starting a campaign is basically an update, we use “PUT” as the method and the Dynamic Data Campaign’s root path concatenated with the campaign ID as the URL (lines 104 to 116). All that is left to do is to execute the call.
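Handling the response and building the URL for the follow-up “PUT” call might look roughly like this. Only the “campaign_id” key comes from the text above; the rest of the response shape, the example root path and both helper names are hypothetical, and the actual HTTP call through the RESTClient is left out.

```php
<?php
// Sketch only: response shape, root path and helper names are assumptions.

/**
 * Pull the campaign ID out of the JSON response, or return null
 * if the body is not valid JSON or contains no "campaign_id".
 */
function extractCampaignId(string $responseBody): ?string
{
    $data = json_decode($responseBody, true);
    if (!is_array($data) || !isset($data['campaign_id'])) {
        return null;
    }
    return (string) $data['campaign_id'];
}

/**
 * Concatenate the campaign root path with the campaign ID.
 */
function buildCampaignUrl(string $rootPath, string $campaignId): string
{
    return rtrim($rootPath, '/') . '/' . rawurlencode($campaignId);
}

$response   = '{"campaign_id": "abc123"}'; // made-up example body
$campaignId = extractCampaignId($response);

if ($campaignId === null) {
    echo "Campaign was not created, nothing to start.\n";
} else {
    // Placeholder root path, not the real API URL.
    echo buildCampaignUrl('https://api.example.invalid/dynamic-campaigns/', $campaignId), "\n";
}
```

Checking for a missing ID before the second call avoids sending a “PUT” to the bare root path when campaign creation has failed.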
Phew, this post got quite lengthy as there is a lot of explanatory text. But luckily there was only a little to code ;). We are now done with this part of the series and you should be able to create campaigns automatically with our new script. So give it a try and confirm in the user backend that your campaign was created successfully.
In the next part we will cover the automated rating and decision-making process of our Comment Moderation System. This will be the last part of our tutorial series and we hope we can count on you.