Tag Archives: natural language processing

Two Projects in Text Data Mining and Natural Language Processing

ELENA FILATOVA

Department of Computer Systems Technology, New York City College of Technology, City University of New York

 

In this presentation I will describe two projects I am working on: Automatic Sarcasm Detection and Information Asymmetries in Multilingual Wikipedia.

Sarcasm detection: Humans are good at identifying sarcasm in text and speech. Can we teach a computer to identify sarcasm? Is it possible to point out the parts of a review that make it sarcastic? To answer these questions, I use a corpus of sarcastic and regular Amazon product reviews. I analyze the sentiment flow of these reviews and demonstrate that classification features based on sentiment flow can be used to reliably classify documents as sarcastic or non-sarcastic.

Multilingual Wikipedia: Wikipedia is currently used as THE source of information, often without anyone questioning the quality of that information. However, Wikipedia articles that correspond to the same entry (person, location, event, etc.) but are written in different languages differ substantially in what information they include. I discuss the nature of information asymmetries in Multilingual Wikipedia and outline my plan for using these asymmetries to automatically extend Wikipedia articles.

Bio: Dr. Filatova has been an Assistant Professor in the Computer Systems Technology department at CUNY CityTech since Fall 2015. Prior to that she was a faculty member at the Fordham CIS department. She received her Ph.D. in Computer Science from Columbia University in 2008.

FEBRUARY 25 @ 12:00 PM – 1:00 PM in N928

DETAILS

Date:
February 25
Time:
12:00 pm – 1:00 pm

VENUE

N928
300 Jay St., Room N928
Brooklyn, NY 11201 United States
Phone:
718-260-5170
Website:
http://www.citytech.cuny.edu/academics/deptsites/cst

ORGANIZER

Computer Systems Technology Colloquium Series
Phone:
(718) 260-5170
Website:
https://openlab.citytech.cuny.edu/cstcolloquium

More than Words: Advancing Prosodic Analysis

Computer Systems Technology Colloquium Series presents:
More than Words: Advancing Prosodic Analysis
Andrew Rosenberg

COMPUTER SYSTEMS TECHNOLOGY
NEW YORK CITY COLLEGE OF TECHNOLOGY,
CITY UNIVERSITY OF NEW YORK
300 JAY ST.
BROOKLYN, NY 11201

THURSDAY, FEBRUARY 5, 2015, 12-1 PM
ROOM N906
LIGHT REFRESHMENTS WILL BE SERVED!

Prosody is an essential component of human speech. Broadly, prosody describes all of the production qualities of speech that are not involved in conveying lexical information. Where the words are “what is said”, prosody is “how it is said”. Prosody plays an important role not only in communicating the syntax, semantics, and pragmatics of spoken language, but also in conveying information about the speaker and their internal state (e.g., emotion or fatigue).

Understanding prosody is critical to understanding speech communication. Spoken language processing (SLP) technology that approaches human levels of competence will necessarily include automatic analysis of prosody. Despite the importance of prosody in spoken communication, researchers are often unable to reliably incorporate prosodic information into applications. One explanation is a lack of compact, consistent, and universal representations of prosodic information. This talk will describe the state of the art in prosodic analysis and its use in spoken language processing with a focus on the development of new representations of prosody.

Andrew Rosenberg is an Assistant Professor of Computer Science at Queens College (CUNY) and a member of the Doctoral Faculty of the Computer Science and Linguistics programs at the CUNY Graduate Center. He completed his Ph.D. at Columbia University in 2009. Dr. Rosenberg leads the Speech Lab @ Queens College and is an NSF CAREER Award winner. His research concerns Natural Language Processing, Spoken Language Processing, Prosody/Intonation, and Machine Learning. He also collaborates part time at the IBM TJ Watson Research Lab, where he helps improve the speech synthesis quality for Watson, the Jeopardy!-playing system.

