
Interview: How Lexikat is making Natural Language Processing (NLP) faster and more accurate.

Find out how Vox Dei (also known as Lexikat) is making Natural Language Processing (NLP) faster and more accurate.

YS: Today, we have Lexikat’s Jennifer, Junjie and Shaorong. Founded by a group of researchers from the Lee Kuan Yew School of Public Policy, Lexikat is a spin-off from the National University of Singapore’s highly acclaimed Graduate Research Innovation Programme (GRIP).

They will shortly be launching their first product, a software package that will enable researchers and analysts to cut the time they spend analysing and processing data by more than 65% compared to traditional methods.

1. I’ll start first and ask the obvious question. Tell us the story behind how you came up with the idea for Lexikat.

Jennifer: My Ph.D. research at the National University of Singapore was in public policy. It required us to process a lot of text data, which took a lot of time and energy. The existing software for doing this gives such a poor user experience that we didn’t want to use it at all, and the different packages either produce unsatisfactory results or are extremely difficult to use. That’s why we thought of developing a tool that makes text analysis simple while being reliable and accurate. That’s where the idea for Lexikat came from.

Junjie: I was in the financial industry for several years. We would process a large number of financial reports and announcements from listed companies every day. While working as a research assistant at NUS, I also needed to process a large number of texts, which left me with a good understanding of analytical algorithms.

Jennifer: At the time NUS was trying to encourage entrepreneurship, so Junjie and I founded a company called Vox Dei to commercialise our software. Our main product is Lexikat, which is intended to make text analysis easier. The beta version is already online, and we are constantly optimizing the algorithm and the front and back end. We are looking forward to putting the full business edition on the market as soon as possible so that more people in need can use Lexikat.

2. For most of us who invest in the US and China, it is not hard to believe that there is a huge volume of data to be processed. When you were starting off in the NUS GRIP programme, surely you would also have looked into your competitor landscape and their unique selling points. What insights did you gather back then (that Lexikat offers a significant advantage over competitors)? What has changed after GRIP?

Jennifer: The NUS GRIP programme requires startup projects to have original IP and market potential. In order to check whether there was a real market for our product, we did a lot of research work.

Existing software on the market can be divided into two main types: QDA (qualitative data analysis) and NLP (natural language processing). These two kinds of software have distinct advantages, but also have their own shortcomings.

Using QDA to process the data, you have complete control over the outputs, but you have to do everything by hand so the processing speed is very slow. Although NLP is very fast, it’s hard to use – it not only requires that the user have a certain level of programming skills, but it’s also difficult or impossible to customise the outputs. Lexikat provides the advantages of both, with none of the disadvantages.

Not only is it easy to use, but users can customise the outputs. More importantly, it produces fast outputs and the algorithm gives outstandingly human-like results, so it’s filling a gap on the market. Because of this, our project successfully got through the GRIP programme evaluations and received investment from NUS. GRIP gave us the starting capital and provided the basis for us to form a team, develop products and promote our software.

Not only that, GRIP also provided us with opportunities to talk to potential investors and customers, and gave us advice. Importantly, GRIP also gave us a huge reputation boost. If we were only a small independent startup, it would be far more difficult to gain the trust of the market. As an NUS-backed company, we have much more credibility, which is important for sales and hard to get if you are a startup.

3. What makes this time so perfect for your start-up to kick off?

Shaorong: The text analytics market has always existed, and with the growth of big data analysis it has become more and more important, but the solutions on the market today are not particularly satisfactory. This is why Lexikat has a significant market value, and why this was recognised by NUS GRIP and private investors.

We are confident that the market will feel the same way. According to market research data, the global market for NLP currently exceeds 6 billion U.S. dollars, with an annual growth rate of more than 20%. The main buyers are universities, market research and consulting companies, where the demand is rapidly increasing.

The software already supports English data, which not only covers most academic needs but can also be used to connect with major global social media platforms (Twitter, Facebook, etc.).

Our first algorithm was based on Chinese, though, because China has huge market demand. Jennifer’s original research data was mainly in Chinese, and Junjie is from China. We have made several trips to Shanghai, Suzhou and Hangzhou to look into opening a China office. The Chinese market is very different from the English-language one, but we believe that our products can meet its needs. We are rapidly developing and improving our products, and look forward to opening up a larger market as soon as possible.

4. As you know, SSII is very happy to be your partner expanding into China. We love the fact that you are building something that is simple to use and making work much easier for many analysts and researchers.

Given the large talent pool of researchers and analysts in China, it is not hard to imagine that your business could expand not just in China but also in the United States in the future.

Share a bit about what you think about that, and when it might be appropriate for Lexikat to embark on it.

Jennifer: The market for text analytics is huge. There are software vendors like IBM and Microsoft in there, but we have our own entry point. We were able to use NUS resources in the development process, and Lexikat was originally designed for university research. This is why the first market segment we’ll be targeting will be colleges and research institutions. Many academic users have already tried our beta software and given us good feedback and many valuable suggestions. Lexikat has received advance sign-ups from more than 90 professionals from 13 countries, mainly from universities and research institutes. They will be our beachhead market when Lexikat officially goes online. We are confident that we can help them solve practical problems and expand into other markets from there.

Lexikat has also been used in-house in several consulting projects that we have done for public and private organizations, letting us bring in revenue while testing the software. From this we can tell that there is a wide range of applications for the software in market research, consulting and other similar industries. We support Chinese and English language uploads, serving the world’s largest audience. The Chinese market is very large – not only are there a large number of universities, but industries such as market research and consulting are booming. Being able to win Chinese customers is very important for our future development. At present, we have relationships with well-known Chinese universities such as Fudan University and Zhejiang University.

From the first tests, the Chinese algorithm produced very human-like outputs, so we are confident that we can satisfy Chinese customers. We plan to establish a Chinese office to develop and serve the Chinese market. Moreover, because of the way the software is designed, it is relatively easy for us to add in additional languages. We already have a Japanese demo and are working towards a Bahasa Indonesia version.

5. Share with us one case study of the work that you have done. Why do you like that project so much?

Jennifer: Our customers come from many countries and industries, but currently they are divided into three main categories. The first category is Lexikat’s subscription customers; we have had over 90 advance subscriptions from 13 countries, including from the National University of Singapore, Fudan University, Peking University, and the University of Toronto.

The second type is customers who need customized software or algorithms. Because they have specific needs, we can provide them with customized APIs. Sometimes they need specific algorithms – we made one for analysing the sentiment in cryptocurrency news headlines, for example – and sometimes they have particular data protection requirements. That would be the case for the Institute of Policy Studies at the Lee Kuan Yew School, for example; because they handle government data, they need to do a lot of work offline.

The third type is consultancy clients. We use Lexikat to process their data and produce reports for them. These clients include Singapore’s Ministry of Culture, Community and Youth, The Motion Picture Association, King’s College London and the Independent.sg.

One example would be the “Cars, Condos and Cai Png” report that we made in cooperation with the Lee Kuan Yew School of Public Policy. This project investigated Singaporeans’ views on social class; we collected open-ended online survey data and analysed the outputs. The aim was to produce a viral news story, and immediately after the article was published it provoked a lot of debate. It was covered in Chinese and English media such as Lianhe Zaobao and Today, generating a lot of publicity and bringing in new customers.

6. What has been the most surprising thing about your entrepreneurial journey so far?

Jennifer: Starting a business has been an interesting experience; I expected it to be a lot more stressful than it was. Even though our development process was affected by the coronavirus epidemic, it still hasn’t been as difficult as I was expecting – I guess it’s a reflection of how good the Singapore business environment is.

Junjie: I was amazed when Lexikat went online for the first time. It was both a nervous and an exciting moment because I didn’t know what would happen. I was worried about unexpected bugs, and even worried that it would not work at all. But we got the analysis results and they were great. After that, we improved the algorithm again and again to solve various technical problems. I hope Lexikat’s users will be happy with the algorithm and the website.

7. What is the best way to contact you?

Jennifer: We are launching our beta soon for testing. You may sign up here at https://voxdei.io/free-signups.

You can also reach out to me via my mobile: 0065 91330611 or email me at Jen[email protected]

Alternatively, visit our website for more information: https://Voxdei.io or https://Lexikat.com

You can also reach out to our partner, SSII, to find out more.