Semantic Software Lab
Concordia University
Montréal, Canada

Blogroll

A Deep Neural Network Sentence Level Classification Method with Context Information

Today we're looking at the work done within the group which was reported in EMNLP2018: "A Deep Neural Network Sentence Level Classification Method with Context Information", authored by Xingyi Song, Johann Petrak and Angus Roberts, all of the University of Sheffield.
Song, X., Petrak, J. & Roberts, A. A Deep Neural Network Sentence Level Classification Method with Context Information. in EMNLP2018 – 2018 Conference on Empirical Methods in Natural Language Processing 00, 0-000 (2018).
Understanding complex bodies of text is a difficult task, especially when the context of a statement can greatly influence its meaning. While methods exist that examine the context surrounding a phrase, the authors present a new approach that makes use of much larger contexts than these. This allows for greater confidence in the results of such a method, especially when dealing with complicated subject matter. Medical records are one such area, in which complex judgements on appropriate treatments are made across several sentences. It is therefore vital to fully understand the context of each individual statement in order to collate meaning and accurately understand the sentiment of the entire body of text and the conclusion that should be drawn from it.
Although grounded in the medical domain, the new technique is more widely applicable. An evaluation in non-medical domains showed a solid improvement of over six percentage points over its nearest competitor, despite requiring 33% less training time. The technique examines not only the subject sentence but also the context on either side of it. This context is encoded using an adapted FOFE technique that allows for large contexts without crippling amounts of additional computation.
But how does it work? At its core, this novel method analyses not only the target sentence but also an amount of text on either side of it. This context is encoded using an adapted Fixed-size Ordinally Forgetting Encoding (FOFE), turning it from a variable length context into a fixed length embedding. This is processed along with the target, before being concatenated and post-processed to produce an output. 
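To make the encoding step more concrete, here is a minimal sketch of plain FOFE in Python. It is not the authors' adapted version: it only shows the standard recurrence z_t = alpha * z_(t-1) + e_t, where e_t is the one-hot vector of the t-th word. The function name, the toy vocabulary and the forgetting factor of 0.5 are all assumptions chosen for illustration.

```python
def fofe_encode(tokens, vocab, alpha=0.5):
    """Encode a variable-length token sequence as one fixed-length vector.

    Applies the FOFE recurrence z_t = alpha * z_(t-1) + e_t, where e_t is
    the one-hot vector of the t-th token. `vocab` and `alpha` are
    illustrative assumptions, not values taken from the paper.
    """
    index = {word: i for i, word in enumerate(vocab)}
    z = [0.0] * len(vocab)
    for token in tokens:
        z = [alpha * value for value in z]  # every earlier word fades a little
        if token in index:                  # unknown tokens only age the context
            z[index[token]] += 1.0
    return z


vocab = ["a", "b", "c"]  # hypothetical three-word vocabulary
short_context = fofe_encode(["a", "b"], vocab)
long_context = fofe_encode(["a", "b", "c", "a"], vocab)
```

Both contexts come out as vectors of length three, however many words went in, which is what lets a large context be passed on at a fixed cost.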
Experiments then compared the new technique with peer techniques. The results showed markedly improved performance over LSTM-CNN methods, despite taking almost the same amount of time. The new Context-LSTM-CNN technique even outperformed an L-LSTM-CNN method, despite a substantial reduction in required time.
Table: Average test accuracy and training time. Best values are marked in bold, standard deviations in parentheses.
In conclusion, a new technique is presented, Context-LSTM-CNN, that combines the strength of LSTM and CNN with the lightweight context encoding algorithm, FOFE. The model shows a consistent improvement over both a non-context-based model and an LSTM context-encoded model for the sentence classification task.
Categories: Blogroll

ICML 2019: Some Changes and Call for Papers

Machine Learning Blog - Tue, 2018-11-27 22:52

The ICML 2019 Conference will be held from June 10-15 in Long Beach, CA — about a month earlier than last year. To encourage reproducibility as well as high quality submissions, this year we have three major changes in place.

There is an abstract submission deadline on Jan 18, 2019. Only submissions with proper abstracts will be allowed to submit a full paper, and placeholder abstracts will be removed. The full paper submission deadline is Jan 23, 2019.

This year, the author list at the paper submission deadline (Jan 23) is final. No changes will be permitted after this date for accepted papers.

Finally, to foster reproducibility, we highly encourage code submission with papers. Our submission form will have space for two optional supplementary files — a regular supplementary manuscript, and code. Reproducibility of results and easy accessibility of code will be taken into account in the decision-making process.

Our full Call for Papers is available here.

Kamalika Chaudhuri and Ruslan Salakhutdinov
ICML 2019 Program Chairs


Adapted TextRank for Term Extraction: A Generic Method of Improving Automatic Term Extraction Algorithms

This summer, we presented some of our latest work at SEMANTiCS 2018 in Vienna: "Adapted TextRank for Term Extraction: A Generic Method of Improving Automatic Term Extraction Algorithms".
Zhang, Z., Petrak, J. & Maynard, D. Adapted TextRank for Term Extraction: A Generic Method of Improving Automatic Term Extraction Algorithms. in SEMANTiCS 2018 – 14th International Conference on Semantic Systems 00, 0-000 (2018).

This work has been carried out in the context of the EU KNOWMAK project, where we're developing tools for multi-topic classification of text against an ontology, in order to attempt to map the state of European research output in key technologies.
Automatic Term Extraction (ATE) is a fundamental technique used in computational linguistics for recognising terms in text. Processing the collected terms in a text is a key step in understanding the content of the text. There are many different ATE methods, but these all tend to work well only in one specific domain. In other words, there is no universal method which produces consistently good results, and so we have to choose an appropriate method for the domain being targeted.
In this work, we have developed a novel method for ATE which addresses two major limitations: the fact that no single ATE method consistently performs well across all domains, and the fact that the majority of ATE methods are unsupervised. Our generic method, AdaText, improves the accuracy of existing ATE methods, using existing lexical resources to support them, by revising the TextRank algorithm. After being given a target text, AdaText:
  1. Selects a subset of words based on their semantic relatedness to a set of seed words or phrases relevant to the domain, but not necessarily representative of the terms within the target text. 
  2. It then applies an adapted TextRank algorithm to create a graph for these words, and computes a text-level TextRank score for each selected word. 
  3. Finally, these scores are used to revise the score of a term candidate previously computed by an ATE method. 
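The three steps above can be sketched roughly as follows. This is a simplified illustration, not the AdaText implementation: the graph here links words that share a sentence, the ranking is an ordinary PageRank iteration, and the multiplicative revision in step 3 is an assumption, since the post does not give the exact formula. All function names, the similarity values and the sample sentences are hypothetical.

```python
from itertools import combinations


def select_words(words, similarity_to_seeds, threshold):
    """Step 1: keep words whose similarity to the seed set meets the threshold."""
    return {w for w in words if similarity_to_seeds.get(w, 0.0) >= threshold}


def textrank(sentences, selected, damping=0.85, iterations=50):
    """Step 2: PageRank over a sentence co-occurrence graph of selected words."""
    if not selected:
        return {}
    neighbours = {w: set() for w in selected}
    for sentence in sentences:
        present = sorted(set(sentence) & selected)
        for a, b in combinations(present, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(selected)
    scores = {w: 1.0 / n for w in selected}
    for _ in range(iterations):
        scores = {
            w: (1 - damping) / n
            + damping * sum(scores[v] / len(neighbours[v]) for v in neighbours[w])
            for w in selected
        }
    return scores


def revise_score(term, base_score, word_scores):
    """Step 3: boost an ATE score by the mean TextRank score of the term's words.

    The multiplicative form is an assumption made for this sketch.
    """
    parts = term.split()
    boost = sum(word_scores.get(p, 0.0) for p in parts) / len(parts)
    return base_score * (1.0 + boost)


# Hypothetical toy data: tokenised sentences and precomputed seed similarities.
sentences = [
    ["protein", "binds", "receptor"],
    ["receptor", "activates", "kinase"],
    ["the", "protein", "kinase"],
]
similarity = {"protein": 0.9, "receptor": 0.8, "kinase": 0.7,
              "binds": 0.2, "activates": 0.1, "the": 0.0}
all_words = {w for s in sentences for w in s}

selected = select_words(all_words, similarity, threshold=0.5)
word_scores = textrank(sentences, selected)
revised = revise_score("protein kinase", 2.0, word_scores)
unchanged = revise_score("the binds", 2.0, word_scores)
```

Raising the threshold shrinks the selected word set, which is the parameter varied along the horizontal axis of the figures discussed further on.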
This technique was trialled using a variety of parameters (such as the threshold of semantic similarity to select words, as described in step one) over two distinct datasets (GENIA and ACLv2, comprising Medline abstracts and abstracts from ACL respectively). We also tested it with a wide variety of state-of-the-art ATE methods, including modified TFIDF, CValue, Basic, RAKE, Weirdness, LinkProbability, X2, GlossEx and PositiveUnlabeled.



The figures show a sample of performances in different datasets and using different ATE techniques. The base performance of the ATE method is represented by the black horizontal line. The horizontal axis represents the semantic similarity threshold used in step 1. The vertical axis shows average P@K for all five Ks considered.
This new generic combination approach can consistently improve the performance of the ATE method by 25 points, which is a significant increase. However, there is still room for improvement. In future work, we aim to optimise the selection of words from the TextRank graph, to expand TextRank to a graph of both words and phrases, and to explore how the size and source of the seed lexicon affect the performance of AdaText.



Please vote

Machine Learning Blog - Mon, 2018-10-29 15:45

This is not at all related to Machine Learning.

I lived in Squirrel Hill as a graduate student at Carnegie Mellon so the massacre there is feeling particularly immediate. While the person who did it is obviously culpable, the pattern of events makes it clear that others bear responsibility as well. This pattern includes an attempted bomber of Democrats and Trump critics by a Trump fanboy. It also includes a more general cross section of Republicans and their leaders pushing anti-semitism and more general xenophobia about migrants.

I don’t believe that stochastic terrorism is the goal here. Instead, I have a rather pessimal view of politics in which politicians do pretty much anything to get re-elected, at least in aggregate. Donald Trump’s presidential campaign showed how to do this with a platform of populism, nostalgia, xenophobia, and anti-abortion voters.

The populist angle is looking fairly broken now between anti-populist tax cuts and widely publicized efforts to allow preexisting condition discrimination by insurance companies via Obamacare repeal. About the only populist angle which works is the economy, which is doing fine. On the other hand, there is no obvious change in employment trends since 2011 and no change in wage trends since 2014 so the case for responsibility is clearly tenuous.

Alliances in a two-party system tend to be fragile since winning with a smaller constituency enables better serving that constituency. Losing the populist angle leaves a double-down on the remaining agenda as the most plausible choice. Xenophobia is much older than democracy and psychologically potent so it has obvious value. It’s historically used by leaders who pick some characteristic to divide people and position themselves to thrive on the conflict or distraction that creates. Almost anything will do—if you take away religion, birthplace, skin color, and ethnicity, it would just change to hair color, nose size, or left-handedness. In a democracy, the goal with this approach is simply convincing people to vote according to their activated xenophobia.

For people embracing xenophobia to retain power, stochastic terrorism is just an unfortunate side effect. In this sense, inciting xenophobia about a caravan of refugee Guatemalans at the other end of Mexico is rather clever since most of them won’t even make it to the US border months after the election plausibly leaving only electoral consequences. Yet xenophobia is known to be hard to control. Given this, it’s difficult to imagine stochastic terrorism as anything other than deliberately accepted by the Republican party leadership as an observed consequence of this behavior. The Squirrel Hill massacre and the attempted bombing campaigns are precisely the sort of thing that can happen when you dial up the rhetoric just before an election.

This is part of a pattern of moral collapse across the Republican party. By any reasonable measure Donald Trump is a serial liar with Republican politicians now mimicking this behavior. A remarkable set of people around the Trump campaign are confessed or convicted criminals with members of the Republican party variously tolerating, condoning, and perhaps mimicking.

In this context, the upcoming midterm election seems particularly important. If politicians in aggregate behave as if they will do anything to get reelected, then voters must vote for the behavior they want at the ballot box rather than relying on or appealing to it at a later date. In most situations, this is about picking and choosing the better candidate. I’ve been registered as an independent for this reason—I want to decide for myself.

This is not most situations. Do voters rebuke the Republican party or not? If the answer is not (a 37% chance according to bettors at present) then the slide into corruption likely accelerates as confirmed control of the government erodes the remaining institutional checks on corruption. We are several steps away from a state of deep corruption and it takes time for the consequences of corruption to really seep into society. But every step on the path makes the situation worse and we are on the wrong path now as evidenced by bombing attempts, a xenophobic massacre, and the wider context creating them.

I want to particularly encourage those who are eligible to vote in the United States midterms November 6th.


Visualizations of Political Hate Speech on Twitter

Recently there's been some media interest in our work on abuse toward politicians. We performed an analysis of abusive replies on Twitter sent to MPs and candidates in the months leading up to the 2015 and 2017 UK elections, disaggregated by gender, political party, year, and geographical area, amongst other things. We've posted about this previously, and there's also a more technical publication here. In this post, we wanted to highlight our interactive visualizations of the data, which were created by Mark Greenwood. The thumbnails below give a flavour of them, but click through to access the interactive versions.
Abusive Replies
Sunburst diagrams showing the raw number of abusive replies sent to MPs before the 2015 and 2017 elections. Rather than showing all candidates, these only show the MPs who were elected (i.e. the successful candidates). These nicely show the proportion of abusive replies sent to each party/gender combination but don't give any sense of what proportion of each MP's replies were abusive. Interactive version here!
Increase in Abuse
An overlapping bar chart showing how the percentage of abuse received per party/gender by MPs has increased between 2015 and 2017. For each party/gender two bars are drawn. The height of the bar in the party colour represents the percentage of replies which were abusive in 2017. The height of the grey bar (drawn at the back) is the percentage of replies which were abusive in 2015 and the width shows the change in volume of abusive replies (i.e. the width is calculated by dividing the 2015 raw abusive reply count by that from 2017 to give a percentage which is then used to scale the width of the bar). So height shows change in proportion, width shows increase in volume. There is also a simple version of this graph which only shows the change in proportion (i.e. the widths of the two bars are the same). Original version here.
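As a small worked example of the bar geometry just described, the sketch below computes the two heights and the width ratio for one party/gender group. The counts are invented purely for illustration and the function name is hypothetical; only the arithmetic follows the description.

```python
def bar_geometry(abusive_2015, total_2015, abusive_2017, total_2017):
    """Return the 2015 height, the 2017 height and the relative bar width."""
    height_2015 = 100.0 * abusive_2015 / total_2015  # % of replies abusive in 2015
    height_2017 = 100.0 * abusive_2017 / total_2017  # % of replies abusive in 2017
    # Width: the 2015 raw abusive reply count divided by the 2017 count.
    width_ratio = abusive_2015 / abusive_2017
    return height_2015, height_2017, width_ratio


# Invented counts: 200 of 10,000 replies abusive in 2015, 600 of 20,000 in 2017.
h15, h17, width = bar_geometry(200, 10_000, 600, 20_000)
```

With these made-up numbers the proportion rises from 2% to 3%, while the width ratio of one third reflects a threefold increase in the raw volume of abuse.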
Geographical Distribution of Abuse
A map showing the geographical distribution of abusive replies. The map of the UK is divided into the NUTS 1 regions, and each region is coloured based on the percentage of abusive replies sent to MPs who represent that region. Data from both 2015 and 2017 can be displayed to see how the distribution of abuse has changed. Interactive version here!

How difficult is it to understand my web pages? Using GATE to compute a complexity score for Web text.

The Web Science Summer School, which took place from 30 July to 4 August at the L3S Research Centre in Hannover, Germany, gave students a chance to learn about a number of tools and techniques related to web science. As part of this, team member Diana Maynard gave a keynote talk about applying text mining techniques to real-world applications such as sentiment and hate speech detection, and political social media analysis, followed by a 90-minute practical GATE tutorial where the students learnt to use ANNIE, TwitIE and sentiment analysis tools. The keynotes and tutorials throughout the week were complemented with group work, where the students were tasked with the question: “Can more meaningful indicators for text complexity be extracted from web pages?”. Here follows the account of one student team who, in the space of only 4 hours, managed to use GATE to complete the task: an extremely creditable performance given their very brief exposure to GATE.

After some discussion, our team decided to focus on a very practical problem: the readability metrics commonly used to assess the difficulty of a text do not account for the target audience or the narrative context. We believed a simple approach employing GATE could offer greater insights into how to identify the relevant features associated with text complexity. Everyone had an intuitive understanding of text complexity; it was when trying to map these ideas onto an objective framework that issues arose. Particular definitions of complexity, understandability, comprehensibility, and readability were mixed and matched when approaching this issue.
In our team's vision, the complexity of a document is based not only on the structure of the sentences but also on the context of its narrative and the ease with which the targeted audience can understand it. In our model, the complexity score of a text is linked to the context of the text’s narrative. This means texts about certain narrative contexts (topics) are inherently harder to understand than other texts. How hard it is to understand a particular text is also related to the capabilities of the reader. Thus, texts on specific narrative contexts can be characterized to create a score of how hard they will be to understand for certain audiences.
To do this, we proposed the following process: 
  1. Create an instance lexicon for content complexity
    • Collect a set of texts from different narrative contexts that the audience may be expected to read, e.g. celebrity news, political news, sports news, medical information leaflets, coursebook fragments.
    • Identify the relevant entities in those texts, i.e. persons, locations, organizations, percentages, dates, and technical terms
    • Assess the complexity of each text by using crowdsourcing, e.g. have a sample of UK young adults assess the difficulty of the texts via ratings or procedures like CLOZE. 
    • Assign a complexity value to each entity in the lexicon based on the complexity values of the text it appeared in and its relevance to those texts.
  2. Assess the complexity of new texts
    • Identify the relevant entities in the text
    • Employ the entity complexity lexicon to compute an estimate value for the new text.
During the allocated time, our team completed the first stage by creating an entity lexicon. We employed GATE to identify entities within an 11-webpage corpus.

Figure: Running the TermRaider plugin to identify the entities in the texts.

The corpus was composed of 9 Wikipedia pages and 2 academic articles, and the complexity of the pages was scored independently (on a 1-10 scale) by 4 team reviewers. Then, the entities were identified for each document by running the ANNIE and TermRaider plugins in the GATE GUI.

Figure: Employing ANNIC to search for entities linked to organizations, locations, persons, dates or percentages within the texts.

These entities were given a complexity score by computing the average complexity of the pages they appeared in. We obtained a set of 5312 entities which were exported to an XML file.

Figure: Result extract exported in XML format from the TermRaider plugin.

Once duplicates had been accounted for, our lexicon was composed of 906 weighted pairs.

Figure: Extract from the named entity lexicon after adding the weights based on the complexity scores of the pages they appeared in.

This lexicon was used to calculate a complexity for a new page set, which showed significant divergence from the base (readability) score we were given at the start.

Figure: Comparison of the scores assigned by the lexicon (1-10) and the complexity score given to us as a base (0-1).

In general, the entities in a text are associated with the text narrative contexts, e.g. celebrity news will include celebrity names and places, while scientific literature will reference percentages, ratios and error estimates. In our model, an annotation of the complexity of a sample of pages from several narrative contexts could be used to determine a complexity value for relevant entities based on the complexity scores of the pages in which it appears, which can then be used to estimate complexity scores for new pages. 
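A minimal sketch of this two-stage model is shown below, written in Python rather than as a GATE pipeline. Each entity receives the average complexity of the pages it appears in, as described above. Scoring a new page as the mean of its known entity scores is an assumption for this sketch, as the team's exact aggregation is not stated. The page names, entities and scores are hypothetical.

```python
from collections import defaultdict


def build_lexicon(page_entities, page_scores):
    """Score each entity as the mean complexity of the pages it appears in."""
    seen = defaultdict(list)
    for page, entities in page_entities.items():
        for entity in set(entities):  # count each page once per entity
            seen[entity].append(page_scores[page])
    return {entity: sum(s) / len(s) for entity, s in seen.items()}


def score_page(entities, lexicon):
    """Estimate a new page's complexity from its known entities.

    Taking the mean is an assumption; returns None if no entity is known.
    """
    known = [lexicon[e] for e in entities if e in lexicon]
    return sum(known) / len(known) if known else None


# Hypothetical annotated pages with reviewer scores on the 1-10 scale.
page_entities = {
    "celebrity_news": ["London", "BBC"],
    "physics_article": ["CERN", "London", "5%"],
}
page_scores = {"celebrity_news": 2.0, "physics_article": 8.0}

lexicon = build_lexicon(page_entities, page_scores)
estimate = score_page(["CERN", "London", "Paris"], lexicon)
```

An entity such as "London" that appears in both an easy and a hard page ends up with a middling weight, which is one reason a relevance measure for entities would be a useful refinement.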
Given the time constraints we had, many of the activities were done based on naïve algorithms and within the limits of our resources. We have some further ideas on how this approach could be explored. First, we believe that any complexity score should take into account the audience capability. In this case, the researcher should appreciate that determining the characteristics of the population they wish to explore is just as important as determining the narrative context and structure of the text. Asking teenagers to read mathematical formulae will yield different complexity scores than if the audience were GPs or older adults.
An objective way of scoring the complexity of a text is to use a comprehension testing process like CLOZE, where every 5th word is replaced with a blank space which respondents are then asked to fill. Such a procedure can be used in crowdsourcing platforms like Mechanical Turk to create complexity lexicons for specific audiences: sample texts of diverse narrative contexts (topics) would be selected to be assessed by the crowd, which would tell us how complex particular groups of people find certain texts (e.g. UK teenagers find maths text really difficult and tweets easy, but the complexity scores may reverse for older Mexican maths professors when given the same texts).
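The CLOZE procedure is simple enough to sketch in a few lines of Python. This is an illustration only: the function names, the blank marker and the exact-match scoring rule are assumptions, and punctuation is left attached to words for simplicity.

```python
def make_cloze(text, step=5, blank="_____"):
    """Replace every `step`-th word with a blank; return the text and answers."""
    words = text.split()
    answers = []
    for i in range(step - 1, len(words), step):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


def cloze_score(responses, answers):
    """Fraction of blanks filled with the original word (case-insensitive)."""
    if not answers:
        return 0.0
    correct = sum(r.strip().lower() == a.lower() for r, a in zip(responses, answers))
    return correct / len(answers)


passage, answers = make_cloze(
    "The quick brown fox jumps over the lazy dog near the river bank today"
)
score = cloze_score(["jumps", "by"], answers)
```

Averaging such scores over many respondents from one audience would give a per-text difficulty estimate for that audience, with lower scores indicating harder texts.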
Another aspect that could easily be improved is the use of centrality metrics like TextRank to determine which named entities are actually relevant to the text, based on their frequency and position within the narrative. Finally, a ranking algorithm like PageRank can be adapted to obtain the complexity scores of the entity lexicon in a way that makes it possible to identify relevant entities by employing clustering algorithms.
Team:
Damianos Melidis, L3S Hannover, Germany
Latifah Alshammari, University of Bath, UK
Fernando Santos Sanchez, University of Southampton, UK
Ahmed Al-Ghez, University of Goettingen, Germany
Fatmah Bamashmoos, University of Bristol, UK

Slides from the group presentation

Students use GATE and Twitter to drive Lego robots—again!

At the university's Headstart Summer School in July 2018, 42 secondary school students (aged 16 and 17) from all over the UK (see below for maps) were taught to write Java programs to control Lego robots, using input from the robots (such as the sensor for detecting coloured marks on the floor) as well as operating the motors to move and turn. The Department of Computer Science provided a Java library for driving the robots and taught the students to use it.

After they had successfully operated the robots, we ran a practical session on 10 and 11 July on "Controlling Robots with Tweets".  We presented a quick introduction to natural language processing (using computer programs to analyse human languages, such as English) and provided them with a bundle of software containing a version of the GATE Cloud Twitter Collector modified to run a special GATE application with a custom plugin to use the Java robot library to control the robots.

The bundle came with a simple "gazetteer" containing two lists of keywords:

left: left, port
turn: turn, take, make, move
and a basic JAPE grammar (set of rules) to make use of it.  JAPE is a specialized programming language used in GATE to match regular expressions over annotations in documents, such as the "Lookup" annotations created whenever the gazetteer finds a matching keyword in a document. (The annotations are similar to XML tags, except that GATE applications can create them as well as read them and they can overlap each other without restrictions.  Technically they form an annotation graph.)



The sample rule we provided would match any keyword from the "turn" list followed by any keyword from the "left" list (with optional other words in between, so that "turn to port", "take a left", "turn left" all work the same way) and then run the code to turn the robot's right motor (making it turn left in place).
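The matching logic of that sample rule can be approximated in plain Python, as sketched below. This is not JAPE and not the code the students used: it only mimics the idea of a "turn" keyword followed by a "left" keyword with a few optional words in between. The keyword sets contain just the words quoted in the examples above, and the `max_gap` limit and function name are assumptions.

```python
import re

# Keywords taken from the example phrases above; the real lists may be longer.
TURN_WORDS = {"turn", "take"}
LEFT_WORDS = {"left", "port"}


def is_turn_left(tweet, max_gap=3):
    """Return True if a turn keyword is followed by a left keyword.

    Up to `max_gap` other words may sit between the two keywords
    (a hypothetical limit chosen for this sketch).
    """
    tokens = re.findall(r"[a-z]+", tweet.lower())
    for i, token in enumerate(tokens):
        if token in TURN_WORDS:
            window = tokens[i + 1 : i + 2 + max_gap]
            if any(word in LEFT_WORDS for word in window):
                return True
    return False


commands = ["turn to port", "take a left", "Turn left!", "left turn", "go right"]
results = [is_turn_left(c) for c in commands]
```

In the real system a successful match would trigger the robot library call that runs the right motor; here the function only reports whether the tweet matches. Note that word order matters, so "left turn" does not match.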

We showed them how to configure the Twitter Collector, follow their own accounts, and then run the collector with the sample GATE application.  Getting the system set up and working took a bit of work, but once the first few groups got their robot to move in response to a tweet, everyone cheered and quickly became more interested.  They then worked on extending the word lists and JAPE rules to cover a wider range of tweeted commands.

Some of the students had also developed interesting Java code the previous day, which they wanted to incorporate into the Twitter-controlled system.  We helped these students add their code to their own copies of the GATE plugin and re-load it so the JAPE rules could call their procedures.

We first ran this project in the Headstart course in July 2017; we made improvements for this year and it was a success again, so we plan to include it in Headstart 2019 too.
The following maps show where all the students and the female students came from.


This work is supported by the European Union's Horizon 2020 project SoBigData (grant agreement no. 654024).  Thanks to Genevieve Gorrell for the diagram illustrating how the system works.