Semantic Software Lab
Concordia University
Montréal, Canada

Feed aggregator

GATE and JSON: Now Supporting 280 Character Tweets!

We first added support for reading tweets stored as JSON objects to GATE in version 8, all the way back in 2014. This support has proved exceptionally useful, both internally to help our own research and to the many researchers outside of Sheffield who use GATE for analysing Twitter posts. Recent changes that Twitter have made to the way they represent Tweets as JSON objects, together with the move to 280-character tweets, have led us to re-develop our support for Twitter JSON and also to develop a simpler JSON format for storing general text documents and annotations.

This work has resulted in two new (or re-developed) plugins: Format: JSON and Format: Twitter. Both are currently at version 8.6-SNAPSHOT and are offered in the default plugin list to users of GATE 8.6-SNAPSHOT.

The Format: JSON plugin contains both a document format and export support for a simple JSON document format inspired by the original Twitter JSON format. Essentially each document is stored as a JSON object with two properties: text and entities. The text field is simply the text of the document, while the entities field contains the annotations and their features. The format of this field is the same as that used by Twitter to store entities, namely a map from annotation type to an array of objects, each of which contains the offsets of the annotation and any other features. You can load documents in this format by specifying text/json as the mime type. If your JSON documents don't quite match this format you can still extract the text from them by passing the path through the JSON to the text element, as a dot-separated string, as a parameter of the mime type. For example, assume the text in your document was in a field called text which was not at the root of the JSON document but inside an object named document; you would then load this by specifying the mime type text/json;text-path=document.text. When saved, however, the text and any annotations would be stored at the top level. This format essentially mirrors the original Twitter JSON, but we will now be freezing it as a general JSON format for GATE (i.e. it won't change if/when Twitter changes the way they store Tweets as JSON).
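To make the format concrete, here is a short Python sketch of a document in this shape, plus a helper mirroring the text-path mime type parameter. The "indices" field name follows Twitter's entity convention, and the helper function is my own illustration, not GATE code:

```python
import json

# A minimal document in the simple JSON format described above: a "text"
# field plus an "entities" map from annotation type to an array of
# objects carrying offsets and any other features.
doc = json.loads("""
{
  "text": "GATE was developed at the University of Sheffield.",
  "entities": {
    "Organization": [ {"indices": [26, 49]} ]
  }
}
""")

# Recover each annotated span from its stored offsets.
for ann_type, anns in doc["entities"].items():
    for ann in anns:
        start, end = ann["indices"]
        print(ann_type, "->", doc["text"][start:end])

def get_text(obj, text_path="text"):
    # Mirrors the text-path parameter: a dot-separated path from the
    # root of the JSON object down to the text field.
    for key in text_path.split("."):
        obj = obj[key]
    return obj

# The text/json;text-path=document.text example from the post:
nested = {"document": {"text": "some text"}}
print(get_text(nested, "document.text"))  # prints "some text"
```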

As stated earlier, the new version of our Format: Twitter plugin now fully supports Twitter's new JSON format. This means we can correctly handle not only 280-character tweets but also quoted tweets. Essentially a single JSON object may now contain multiple tweets in a nested hierarchy. For example, you could have a retweet of a tweet which itself quotes another tweet; this is represented as three separate tweets in a single JSON object. Each top-level tweet is loaded into a GATE document and covered with a Tweet annotation. Each of the tweets it contains is then added to the document and covered with a TweetSegment annotation. Each TweetSegment annotation has three features: textPath, entitiesPath, and tweetType. The last of these tells you the type of tweet (retweet, quoted, etc.), whereas the first two give the dotted path through the JSON object to the fields from which text and entities were extracted to produce that segment. All the JSON data is added as nested features on the top-level Tweet annotation. To use this format, make sure to use the mime type text/x-json-twitter when loading documents into GATE.
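As a rough illustration of the nested hierarchy, the sketch below walks a tweet object and collects the kind of dotted paths that would back the textPath and tweetType features. The retweeted_status, quoted_status, and full_text field names come from Twitter's JSON; the traversal itself is my own approximation, not the plugin's actual code:

```python
def tweet_segments(tweet, path=""):
    # Recursively collect (dotted-path-to-text, tweet-type) pairs from
    # a nested tweet object. Extended (280-character) tweets store
    # their text in "full_text" rather than "text".
    text_key = "full_text" if "full_text" in tweet else "text"
    here = f"{path}.{text_key}" if path else text_key
    if path.endswith("retweeted_status"):
        tweet_type = "retweet"
    elif path.endswith("quoted_status"):
        tweet_type = "quoted"
    else:
        tweet_type = "tweet"
    segments = [(here, tweet_type)]
    for key in ("retweeted_status", "quoted_status"):
        if key in tweet:
            sub_path = f"{path}.{key}" if path else key
            segments.extend(tweet_segments(tweet[key], sub_path))
    return segments

# A retweet of a tweet which itself quotes another tweet: three tweets
# in one JSON object, as in the example above.
rt = {"text": "RT @x: hi",
      "retweeted_status": {"text": "hi",
                           "quoted_status": {"text": "the original"}}}
for seg in tweet_segments(rt):
    print(seg)
```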


So far we've only talked about loading single JSON objects as documents. Usually, however, you end up with a single file containing many JSON objects (often one per line) which you want to use to populate a corpus. For this use case we've added a new JSON corpus populator.


This populator allows you to select the JSON file you want to load, set the mime type used to process each object within the file, and optionally provide a path to a field in the object that should be used to set the document name. In this example I'm loading Tweets, so I've specified /id_str so that the name of the document is the ID of the tweet; paths are /-separated lists of fields leading from the root of the object to the relevant field, and must start with a /.
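The populator's behaviour can be sketched in a few lines of Python, assuming one JSON object per line and a /-separated name path. The function name and the fall-back-to-a-counter behaviour are my own choices for illustration; GATE's implementation will differ:

```python
import json

def populate(lines, doc_id_path="/id_str"):
    # One JSON object per line; the document name is taken from a
    # /-separated path starting with "/", falling back to a counter
    # when the named field is absent.
    keys = doc_id_path.lstrip("/").split("/")
    docs = {}
    for i, line in enumerate(lines):
        obj = json.loads(line)
        node = obj
        try:
            for key in keys:
                node = node[key]
            name = str(node)
        except (KeyError, TypeError):
            name = f"doc-{i}"
        docs[name] = obj
    return docs

corpus = populate(['{"id_str": "1018", "text": "a tweet"}',
                   '{"text": "no id field"}'])
print(sorted(corpus))  # prints ['1018', 'doc-1']
```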

The code for both plugins is still under active development (hence the -SNAPSHOT version number) while we improve error handling etc., so if you spot any issues or have suggestions for features we should add, please do let us know. You can use the relevant issue trackers on GitHub for either the JSON or Twitter format plugins.
Categories: Blogroll

A Real World Reinforcement Learning Research Program

Machine Learning Blog - Fri, 2018-07-06 11:10

We are hiring for reinforcement learning related research at all levels and all MSR labs. If you are interested, apply, talk to me at COLT or ICML, or email me.

More generally though, I wanted to lay out a philosophy of research which differs from (and plausibly improves on) the current prevailing mode.

DeepMind and OpenAI have popularized an empirical approach where researchers modify algorithms and test them against simulated environments, including in self-play. They’ve achieved significant success in these simulated environments, greatly expanding the repertoire of ‘games solved by reinforcement learning’, which consisted of the singleton backgammon when I was a graduate student. Given the ambitious goals of these organizations, the more general plan seems to be “first solve games, then solve real problems”. There are some weaknesses to this approach, which I want to lay out next.

  • Broken API One issue with this is that multi-step reinforcement learning is a broken API in the sense that it creates an interface for problem definitions that is unsolvable via currently popular algorithm families. In particular, you can create problems which are either ‘antishaped’ so local rewards mislead w.r.t. long term rewards or keylock problems, as are common in Markov Decision Process lower bounds. I coded up simple versions of these problems a couple years ago and stuck them on github now to be extra crisp. If you try to apply policy gradient or Q-learning style algorithms on these problems they commonly run into exponential (in the number of states) sample complexity. As a general principle, APIs which create exponential sample complexity are bad—they imply that individual applications require taking advantage of special structure in order to succeed.
  • Transference Another significant issue is the degree of transference between solutions in simulation and the real world. “Transference” here potentially happens at several levels.
    • Do the algorithms carry over? One of the persistent issues with simulation-based approaches is that you don’t care about sample complexity that much—optimal performance at acceptable computational complexities is the typical goal. In real world applications, this is somewhat absurd—you really care about immediately doing something reasonable and optimizing from there.
    • Do the simulators carry over? For every simulator, there is a fidelity question which comes into play when you want to transfer a policy learned in the simulator into action in the real world. Real-time ray tracing and simulator quality more generally are advancing, but I’m not ready yet to trust a self-driving car trained in a simulated reality. An accurate simulation of the physics is unclear—friction for example is known-difficult, and more generally the representative variety of exogenous events in an open world seems quite difficult to implement.
  • Solution generality When you test and discover that an algorithm works in a simulated world, you know that it works in the simulated world. If you try it in 30 simulated worlds and it works in all of them, it can still easily be the case that an algorithm fails on the 31st simulated world. How can you achieve confidence beyond the number of simulated worlds that you try and succeed on? There is some sense by which you can imagine generalization over an underlying process generating problems, but this seems like a shaky justification in practice, since the nature of the problems encountered seems to be a nonstationary development of an unknown future.
  • Value creation Solutions of a ‘first A, then B’ flavor naturally take time to get to the end state where most of the real value is set to be realized. In the years before reaching applications in the real world, does the funding run out? We certainly hope not for the field of research but a danger does exist. Some discussion here including the comments is relevant.
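The ‘keylock’ construction from the first point above can be sketched as a toy environment. This is my own minimal version for illustration, not the code referenced in the post:

```python
import random

class CombinationLock:
    # Keylock MDP sketch: at each of n steps exactly one of two actions
    # advances toward the goal; any wrong action resets to the start
    # state, and the only reward arrives at the end. Uniformly random
    # exploration reaches the reward with probability 2**-n per
    # episode, which is the exponential (in the number of states)
    # sample complexity discussed above.
    def __init__(self, n, seed=0):
        self.n = n
        self.good = [random.Random(seed + i).randrange(2) for i in range(n)]

    def episode(self, policy):
        state = 0
        for _ in range(self.n):
            state = state + 1 if policy(state) == self.good[state] else 0
        return 1.0 if state == self.n else 0.0

env = CombinationLock(10)
print(env.episode(lambda s: env.good[s]))      # optimal policy, prints 1.0
print(env.episode(lambda s: 1 - env.good[s]))  # always-wrong policy, prints 0.0
```

Local rewards give no gradient here: any algorithm that relies on stumbling into the terminal reward needs on the order of 2^n episodes before it sees a single success.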

What’s an alternative?

Each of the issues above is addressable.

  • Build fundamental theories of what are statistically and computationally tractable sub-problems of Reinforcement Learning. These tractable sub-problems form the ‘APIs’ of systems for solving these problems. Examples of this include simpler (Contextual Bandits), intermediate (Learning to Search), and more advanced (Contextual Decision Processes) settings.
  • Work on real-world problems. The obvious antidote to simulation is reality, driving both the need to create systems that work in reality as well as a research agenda around reality-centered issues like performance at low sample complexity. There are some significant difficulties with this—reinforcement style algorithms require interactive access to learn which often drives research towards companies with an infrastructure. Nevertheless, offline evaluation on real-world data does exist and the choice of emphasis in research directions is universal.
  • The combination of fundamental theories and a platform which distills learnings so they are not forgotten and always improved upon provides a stronger basis for expectation of generalization into the next problem.
  • The shortest path to creating valuable applications in the real world is to simply work on creating valuable applications in the real world. Doing this in a manner guided by other elements of the research program is just good sense.

The above must be applied in moderation—some emphasis on theory, some emphasis on real world applications, some emphasis on platforms, and some emphasis on empirics. This has been my research approach for a little over 10 years, ever since I started working on contextual bandits.

Let’s call the first research program ‘empirical simulation’ and the second research program ‘real fundamentals’. The empirical simulation approach has a clear strong advantage in that it creates impressive demos, which creates funding, which creates more research. The threshold for contribution to the empirical simulation approach may also be lower simply because it requires mastery of fewer elements, implying people can more easily participate in it. At the same time, the real fundamentals approach has clear advantages in addressing the weaknesses of the empirical simulation approach. At a concrete level, this means we have managed to define and create fundamentals through research while creating real-world applications and value radically more efficiently than the empirical simulation approach has achieved.

The ‘real fundamentals’ concept is behind the open positions above. These positions have been designed to come with both the colleagues and mandate to address the most difficult research problems along with the organizational leverage to change the world. For people interested in fundamentals and making things happen in the real world these are prime positions—please consider joining us.

Categories: Blogroll

11th GATE Training Course: Large Scale Text and Social Media Analytics with GATE

Every year for the last decade, the GATE team at Sheffield have been delivering summer courses helping people get to grips with GATE technology. One year we even ran a second course in Montreal! It's always a challenge deciding what to include. GATE has been around for almost a quarter of a century, and in that time it has organically grown to include a wide variety of technologies too numerous to cover in a week-long course, adapting to the changing needs of our users during one of the most technologically exciting periods in history. But under the capable leadership of Diana Maynard and Kalina Bontcheva, we've learned to squeeze the most useful material into the limited time available, helping beginners to get started with GATE without overwhelming them, as well as empowering more experienced users to see the potential to push it into new territory.

Recent years have seen a surge of interest in social media. These media offer potential for commercial users to deepen their understanding of their customers, and for researchers to explore and understand the ways in which these media are affecting society, as well as using social media data for various other research purposes. For this reason, we have positioned social media as a central theme for the course, which most students seem to find accessible and interesting. It provides an opportunity to showcase GATE's Twitter support, and draw examples from our own work on social media within the Societal Debates theme of SoBigData. However, there are also plenty of examples illustrating how GATE can be applied to other popular areas, such as analysis of news or medical text.

I've been teaching GATE's machine learning offering for most of the time the course has been running, and therefore I've had the opportunity to explore different ways of helping people to get a handle on what can seem an intimidating topic to those who aren't already familiar with it. Machine learning is challenging to teach to a mixed audience, because it's such a large field and the time is limited. It's also an important one though, as it's increasingly a part of the public discourse, and many students are excited to learn about the ways they can incorporate machine learning into their work using GATE. Johann Petrak has taken the lead on keeping the GATE Learning Framework up to date with the latest developments in this rapidly evolving field, and I'm always proud and excited to teach something new that's been added since the last course.

It's evident from the discussions during lunch and tea breaks that students are eager to talk to us about how they are using GATE, and how they would like to use it. I think one of the most valuable things about the course is the opportunity it provides for the students to talk to us about what they are doing with GATE, and for us to be inspired by the range of uses to which GATE is being put. Here is some of the feedback we received from students this year:

"Last week was one of the most useful courses I have done. Overall I think it was pitched really well given the range of technical abilities."

"Thank you all for such an informative and well-delivered course. I was a little worried about whether I'd be able to pick it up as I don’t have a background in programming, but I learned so much and the trainers were all very helpful and patient."

Categories: Blogroll

When the bubble bursts…

Machine Learning Blog - Mon, 2018-06-04 16:39

Consider the following facts:

  1. NIPS submissions are up 50% this year to ~4800 papers.
  2. There is significant evidence that the process of reviewing papers in machine learning is creaking under several years of exponentiating growth.
  3. Public figures often overclaim the state of AI.
  4. Money rains from the sky on ambitious startups with a good story.
  5. Apparently, we now even have a fake conference website (https://nips.cc/ is the real one for NIPS).

We are clearly not in a steady-state situation. Is this a bubble or a revolution? The answer surely includes a bit of revolution—the fields of vision and speech recognition have been turned over by great empirical successes created by deep neural architectures and more generally machine learning has found plentiful real-world uses.

At the same time, I find it hard to believe that we aren’t living in a bubble. There was an AI bubble in the 1980s (before my time), a tech bubble around 2000, and we seem to have a combined AI/tech bubble going on right now. This is great in some ways—many companies are handing out professional-sports-scale signing bonuses to researchers. It’s a little worrisome in other ways—can the field effectively handle the stress of the influx?

It’s always hard to say when and how a bubble bursts. It might happen today or in several years and it may be a coordinated failure or a series of uncoordinated failures.

As a field, we should consider the coordinated failure case a little bit. What fraction of the field is currently at companies or in units at companies which are very expensive without yet justifying that expense? It’s no longer a small fraction so there is a chance for something traumatic for both the people and field when/where there is a sudden cut-off. My experience is that cuts typically happen quite quickly when they come.

As an individual researcher, consider this an invitation to awareness and a small amount of caution. I’d like everyone to be fully aware that we are in a bit of a bubble right now and consider it in their decisions. Caution should not be overdone—I’d gladly repeat the experience of going to Yahoo! Research even knowing how it ended. There are two natural elements here:

  1. Where do you work as a researcher? The best place to be when a bubble bursts is on the sidelines.
    1. Is it in the middle of a costly venture? Companies are not good places for this in the long term whether a startup or a business unit. Being a researcher at a place desperately trying to figure out how to make research valuable doesn’t sound pleasant.
    2. Is it in the middle of a clearly valuable venture? That could be a good place. If you are interested we are hiring.
    3. Is it in academia? Academia has a real claim to stability over time, but at the same time opportunity may be lost. I’ve greatly enjoyed and benefited from the opportunity to work with highly capable colleagues on the most difficult problems. Assembling the capability to do that in an academic setting seems difficult since the typical maximum scale of research in academia is a professor+students.
  2. What do you work on as a researcher? Some approaches are more “bubbly” than others—they might look good, but do they really provide value?
    1. Are you working on intelligence imitation or intelligence creation? Intelligence creation ends up being more valuable in the long term.
    2. Are you solving synthetic or real-world problems? If you are solving real-world problems, you are almost certainly creating value. Synthetic problems can lead to real-world solutions, but the path is often fraught with unforeseen difficulties.
    3. Are you working on a solution to one problem or many problems? A wide applicability for foundational solutions clearly helps when a bubble bursts.

Researchers have a great ability to survive a bubble bursting—a built up public record of their accomplishments. If you are in a good environment doing valuable things and that environment happens to implode one day the strength of your publications is an immense aid in landing on your feet.

Categories: Blogroll

Funded PhD Opportunity: Large Scale Analysis of Online Disinformation in Political Debates

Applications are invited for an EPSRC-funded studentship at The University of Sheffield commencing on 1 October 2018.
The PhD project will examine the intersection of online political debates and misinformation, through big data analysis. This research is very timely, because online mis- and disinformation is reinforcing the formation of polarised partisan camps, sharing biased, self-reinforcing content. This is coupled with the rise in post-truth politics, where key arguments are repeated continuously, even when proven untrue by journalists or independent experts. Journalists and media have tried to counter this through fact-checking initiatives, but these are currently mostly manual, and thus not scalable to big data.

The aim is to develop machine learning-based methods for large-scale analysis of online misinformation and its role in political debates on online social platforms.



Application deadline: as soon as possible, until the position is filled  
Interviews: take place within 2-3 weeks of application

Supervisory team: Professor Kalina Bontcheva (Department of Computer Science, University of Sheffield), Professor Piers Robinson (Department of Journalism, University of Sheffield), and Dr. Nikolaos Aletras (Information School, University of Sheffield).


Award Details

The studentship will cover tuition fees at the EU/UK rate and provide an annual maintenance stipend at standard Research Council rates (£14,777 in 2018/19) for 3.5 years.
Eligibility

The general eligibility requirements are:
  • Applicants should normally have studied in a relevant field to a very good standard at MSc level or equivalent experience.
  • Applicants should also have a 2.1 in a BSc degree, or equivalent qualification, in a related discipline.
  • EPSRC studentships are only available to students from the UK or European Union. Applications cannot be accepted from students liable to pay fees at the Overseas rate. Normally UK students will be eligible for a full award, which pays fees and a maintenance grant, if they meet the residency criteria, and EU students will be eligible for a fees-only award, unless they have been resident in the UK for 3 years immediately prior to taking up the award.
How to apply

To apply for the studentship, applicants need to apply directly to the University of Sheffield for entrance into the doctoral programme in Computer Science.


  • Complete an application for admission to the standard computer science PhD programme http://www.sheffield.ac.uk/postgraduate/research/apply 
  • Applications should include a research proposal; CV; academic writing sample; transcripts and two references.
  • The research proposal of up to 1,000 words should outline your reasons for applying to this project and how you would approach the research, including details of your skills and experience in computing and/or data journalism.
  • Supporting documents should be uploaded to your application.
Categories: Blogroll

Reinforcement Learning Platforms

Machine Learning Blog - Mon, 2018-04-16 16:17

If you are interested in building an industrial Reinforcement Learning platform, we are hiring a data scientist and multiple developers as a follow-up to last year’s hiring. Please apply if interested, as this is a real chance to be a part of building the future.

Categories: Blogroll

Discerning Truth in the Age of Ubiquitous Disinformation (5): Impact of Russia-linked Misinformation vs Impact of False Claims Made By Politicians During the Referendum Campaign

Kalina Bontcheva (@kbontcheva)


My previous post focuses mainly on the impact of misinformation from Russian Twitter accounts. However, it is also important to acknowledge the impact of false claims made by politicians, which were shared and distributed through social media.

A House of Commons Treasury Committee report published in May 2016 states that: “The public debate is being poorly served by inconsistent, unqualified and, in some cases, misleading claims and counter-claims. Members of both the ‘leave’ and ‘remain’ camps are making such claims. Another aim of this report is to assess the accuracy of some of these claims.”

In our research, we analysed the number of Twitter posts around some of these disputed claims, firstly to understand their resonance with voters, and secondly to compare this to the volume of Russia-related tweets discussed above.

A study of the news coverage of the EU Referendum campaign established that the economy was the most covered issue, and in particular, the Remain claim that Brexit would cost households £4,300 per year by 2030 and the Leave campaign's claim that the EU cost the UK £350 million each week. Therefore, we focused on these two key claims and analysed tweets about them.

With respect to the disputed £4,300 claim (made by the Chancellor of the Exchequer), we identified 2,404 posts in our dataset (tweets, retweets, replies) referring to this claim.

For the £350 million a week disputed claim, there are 32,755 pre-referendum posts (tweets, retweets, replies) in our dataset. This is 4.6 times the 7,103 posts related to Russia Today and Sputnik, and 10.2 times the 3,200 tweets by the Russia-linked accounts suspended by Twitter.
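The ratios quoted above follow directly from the counts given in these posts:

```python
posts_350m = 32_755       # pre-referendum posts about the £350m claim
rt_sputnik_posts = 7_103  # posts related to Russia Today and Sputnik
suspended_tweets = 3_200  # tweets by Russia-linked suspended accounts

print(round(posts_350m / rt_sputnik_posts, 1))  # 4.6
print(round(posts_350m / suspended_tweets, 1))  # 10.2
```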

In particular, there are more than 1,500 tweets from different voters, with one of these wordings:

I am with @Vote_leave because we should stop sending £350 million per week to Brussels, and spend our money on our NHS instead.

I just voted to leave the EU by postal vote! Stop sending our tax money to Europe, spend it on the NHS instead! #VoteLeave #EUreferendum

Many of those tweets have themselves received over a hundred likes and retweets each.

This false claim is regarded by the media as one of the key ones behind the success of Vote Leave.

So, returning to Q27 on the likely impact of misinformation on voting behaviour: it was not possible for us to quantify this from such tweets alone. A potentially useful indicator comes from an Ipsos MORI poll published on 22 June 2016, which showed that for 9% of respondents the NHS was the most important issue in the campaign.


In conclusion, while it is important to quantify the potential impact of Russian misinformation, we should also consider the much wider range of misinformation that was posted on Twitter and Facebook during the referendum and its likely overall impact.

We should also study not just fake news sites and the social platforms that were used to disseminate misinformation, but also the role and impact of Facebook-based algorithms for micro-targeting adverts, that have been developed by private third parties.

A related question is the role played by hyperpartisan and mainstream media sites during the referendum campaign. This is the subject of our latest study, with key findings available here.
High Automation Accounts in Our Brexit Tweet Dataset

While it is hard to quantify all the different kinds of fake accounts, we already know that a study by City University identified 13,493 suspected bot accounts, of which Twitter found only 1% to be linked to Russia. In our referendum tweet dataset there are tweets by 1,808,031 users in total, which makes the City bot accounts only 0.74% of the total.

If we consider, in particular, Twitter accounts that posted more than 50 times a day (considered high-automation accounts by researchers), then there are only 457 such users in the month leading up to the referendum on 23 June 2016.

The most prolific were "ivoteleave" and "ivotestay", both since suspended, which had similar usage patterns. There were also many accounts that did not really post much about Brexit but were using the hashtags to get attention for commercial reasons.

We also analysed the leaning of these 457 high-automation accounts and identified 361 as pro-leave (with 1,048,919 tweets), 39 as pro-remain (156,331 tweets), and the remaining 57 as undecided.

I covered how we can address the “fake news” problem in my previous blog post (link), but in summary we need to promote fact-checking efforts and fund open-source research on automatic methods for disinformation detection.

Disclaimer: All views are my own.

Categories: Blogroll

Discerning Truth in the Age of Ubiquitous Disinformation (4): Russian Involvement in the Referendum and the Impact of Social Media Misinformation on Voting Behaviour

Kalina Bontcheva (@kbontcheva)


In my previous blog posts I wrote about the 4Ps of the modern disinformation age: post-truth politics, online propaganda, polarised crowds, and partisan media; and how we can combat online disinformation.


The news is currently full of reports of Russian involvement in the referendum and the potential impact of social media misinformation on voting behaviour.

A small-scale experiment by the Guardian exposed 10 US voters (five on each side) to alternative Facebook news feeds. Only one participant changed their mind as to how they would vote. Some found their confirmation bias too hard to overcome, while others became acutely aware of being the target of abuse, racism, and misogyny. A few started empathising with voters holding opposing views. They also gained awareness of the fact that opposing views abound on Facebook, but the platform is filtering them out.

Russian Involvement in the Referendum
We analysed the accounts that Twitter identified, in front of the US Congress in the fall of 2017, as being associated with Russia, together with the other 45 accounts that we found with BuzzFeed. We looked at tweets posted by these accounts one month before the referendum, and we did not find a great deal of activity when compared to the overall number of tweets on the referendum, i.e. neither the Russia-linked ads nor the Twitter accounts had a major influence.

There were 3,200 tweets in our data sets coming from those accounts, and 830 of those—about 26%—came from the 45 new accounts that we identified. However, one important aspect that has to be mentioned is that those 45 new accounts were tweeting in German, so even though they are there, the impact of those 830 tweets on the British voter is not very likely to have been significant.

The accounts that tweeted on 23 June were quite different from those that tweeted before or after, with virtually all tweets posted in German. Their behaviour is also very different, with mostly retweets on referendum day by a tight network of anti-Merkel accounts, often within seconds of each other. The findings are in line with those of Prof. Cram from the University of Edinburgh, as reported in the Guardian.

Journalists from BuzzFeed UK and our Sheffield team used the retweet network to identify another 45 suspicious accounts, subsequently suspended by Twitter. Amongst the 3,200 total tweets, 830 came from the 45 newly identified accounts (26%). Similar to those identified by Twitter, the newly discovered accounts were largely ineffective in skewing public debate. They attracted very few likes and retweets – the most successful message in the sample got just 15 retweets.

An important distinction that needs to be made is between Russia-influenced accounts that used advertising on one hand, and the Russia-related bots found by Twitter and other researchers on the other. 

The Twitter sockpuppet/bot accounts generally pretended to be authentic people (mostly American, some German) and would not resort to advertising, but instead tried to go viral or gain prominence through interactions. An example of one such successful account/cyborg is Jenna_Abrams. Here are some details on how the account duped mainstream media:

http://amp.thedailybeast.com/jenna-abrams-russias-clown-troll-princess-duped-the-mainstream-media-and-the-world 

“and illustrates how Russian talking points can seep into American mainstream media without even a single dollar spent on advertising.”

https://www.theguardian.com/technology/shortcuts/2017/nov/03/jenna-abrams-the-trump-loving-twitter-star-who-never-really-existed 

http://money.cnn.com/2017/11/17/media/new-jenna-abrams-account-twitter-russia/index.html 

A related question is the influence of Russia-sponsored media and its Twitter posts. Here we consider the Russia Today promoted tweets: the 3 pre-referendum ones attracted just 53 likes and 52 retweets between them.

We analysed all tweets posted one month before 23 June 2016, which are either authored by Russia Today or Sputnik, or are retweets of these. This gives an indication of how much activity and engagement there was around these accounts. To put these numbers in context, we also included the equivalent statistics for the two main pro-leave and pro-remain Twitter accounts:



Account                          Original tweets   Retweeted by others   Retweets by account   Replies by account   Total tweets
@RT_com (General Russia Today)                39                 2,080                    62                    0          2,181
@RTUKnews                                     78                 2,547                    28                    1          2,654
@SputnikInt                                  148                 1,810                     3                    2          1,963
@SputnikNewsUK                                87                   206                     8                    4            305
TOTAL                                        352                 6,643                   101                    7          7,103

@Vote_leave                                2,313               231,243                 1,399                   11        234,966
@StrongerIn                                2,462               132,201                   910                    7        135,580

We also analysed which accounts retweeted RT_com and RTUKnews the most in our dataset. The top one, with 75 retweets of Russia Today tweets, was a self-declared US-based account that retweets Alex Jones from Infowars, RT_com, China Xinhua News, Al Jazeera, and an Iranian news account. This account (still live) joined in Feb 2009 and as of 15 December 2017 had 1.09 million tweets – an average of more than 300 tweets per day, indicating a highly automated account. It has more than 4k followers, but follows only 33 accounts. The next most active retweeters include a deleted and a suspended account, as well as two accounts that both stopped tweeting on 18 Sep 2016.

For the two Sputnik accounts, the top retweeter made 65 retweets. It declares itself as Ireland-based; has 63.7k tweets and 19.6k likes, many of them self-authored tweets; was last active on 2 May 2017; and was created in May 2015, averaging 87 tweets a day (which possibly indicates an automated account). It also retweeted Russia Today 15 times. The next two Sputnik retweeters (61 and 59 retweets respectively) are accounts with high average posts-per-day rates (350 and 1,000 respectively) and over 11k and 2k followers respectively. Lastly, four of the top 10 accounts have been suspended or deleted.



Disclaimer: All views are my own.
Categories: Blogroll