Open source tool aims to provide smart textual analysis

Recently I looked at a couple of platforms that aimed to locate the latest research in particular fields.  The automated systems used some clever technology to analyze the text contained in papers to judge the merits as well as the content of each paper for the user.

I have some doubts about the sense-making capabilities of the various platforms on the market, but what is not in doubt is the text-analysis capability currently out there.

A good example of this is a tool recently developed by researchers at USC.  The tool, called TACIT (Text Analysis, Crawling and Interpretation Tool), is an open source tool for gathering, managing and analyzing text.

“Currently [text analysis] techniques are available as independent programs or software, but they require a lot of expertise and because social scientists often don’t have the programming background, they don’t use them,” the team say. “So we’ve created a very researcher-friendly environment where they can easily access and use these methods. And if they want more, anyone can write their own plugins for the system.”

Smart text analysis

The tool uses a number of techniques to make sense of a piece of text.  What’s nice about it is the open source nature of the software, which means other developers can easily develop plugins to extend its functionality.

The software comes with three core components:

  1. a crawler to allow text to be captured from a range of online sources
  2. a corpus management feature to process and store bodies of text
  3. an analysis tool to count instances and ratios of words
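
To make the third component concrete, here is a minimal sketch of counting word instances and ratios. It is not TACIT's own code or API; the function names (word_counts, word_ratios), the simple tokenizing rule and the sample sentence are all assumptions made purely for illustration.

```python
import re
from collections import Counter


def word_counts(text):
    """Count how often each lowercase word appears in the text."""
    # Hypothetical tokenizer: runs of letters and apostrophes only.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def word_ratios(text, vocabulary):
    """Return each vocabulary word's share of all words in the text."""
    counts = word_counts(text)
    total = sum(counts.values())
    if total == 0:
        # Empty text: every ratio is zero rather than a division error.
        return {word: 0.0 for word in vocabulary}
    return {word: counts[word] / total for word in vocabulary}


# Illustrative sample text, not taken from any real corpus.
sample = "The tool counts words. The tool also stores text."
counts = word_counts(sample)
ratios = word_ratios(sample, ["tool", "text"])
print(counts["the"], ratios["tool"])
```

A real analysis would likely add stop-word handling and stemming, but the basic idea of instances divided by total words stays the same.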

The software is due for a beta launch this month, with a final release due to go live in March 2016.  The team are already confident that there is demand for it.

“In the first week the program launched,” they say, “there were over 2,500 hits to our website. We had people from Kenya to Vietnam, from Uruguay to Estonia and from Hawaii to Maine downloading the software.”

The last few years have seen tremendous gains in language processing, with innovations such as Siri achieving high levels of competence.

The TACIT team hope that their own system can take things one step further.  It will be an interesting project to follow.
