Research on Indonesian named entity (NE) taggers has been conducted for years, but without using deep learning. Most studies employed traditional machine learning algorithms such as association rules, support vector machines, random forests, and naïve Bayes. In those studies, word lists serving as gazetteers or clue words were provided to improve accuracy. Here, we attempt to employ deep learning in our Indonesian NE tagger. We use long short-term memory (LSTM) as the topology since it is the state of the art for NE tagging. With LSTM, we do not need a word list to improve accuracy. We investigate two main things. The first is the output layer of the network: Softmax versus conditional random field (CRF). The second is the use of a part-of-speech (POS) tag embedding as an input layer. Using 8400 sentences as the training data and 97 sentences as the evaluation data, we found that using POS tag embedding as an input layer improved the performance of our Indonesian NE tagger. As for the comparison between Softmax and CRF, we found that both architectures have a weakness in classifying an NE tag.
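To make the Softmax versus CRF distinction concrete, the following minimal sketch contrasts the two decoding strategies on top of per-token tag scores such as an LSTM might produce. It is an illustration under stated assumptions, not the authors' implementation: the BIO tag set, the score values, the transition matrix, and the function names `softmax_decode` and `crf_decode` are all hypothetical. A Softmax layer chooses each tag independently, while a CRF layer scores the whole tag sequence, here decoded with the Viterbi algorithm.

```python
# Hypothetical BIO tag set and scores, for illustration only (not from the paper).
TAGS = ["O", "B-PER", "I-PER"]


def softmax_decode(emissions):
    """Pick the highest-scoring tag for each token independently.

    Softmax is monotonic, so the argmax over raw scores equals the
    argmax over softmax probabilities.
    """
    return [TAGS[max(range(len(TAGS)), key=lambda t: row[t])] for row in emissions]


def crf_decode(emissions, transitions):
    """Viterbi search for the tag sequence with the highest total score.

    The total score is the sum of per-token emission scores and
    tag-to-tag transition scores, where transitions[prev][cur] is the
    score of moving from tag prev to tag cur.
    """
    n_tags = len(TAGS)
    scores = list(emissions[0])
    backpointers = []
    for row in emissions[1:]:
        new_scores, pointers = [], []
        for cur in range(n_tags):
            best_prev = max(range(n_tags), key=lambda p: scores[p] + transitions[p][cur])
            new_scores.append(scores[best_prev] + transitions[best_prev][cur] + row[cur])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path back from the best final tag.
    best = max(range(n_tags), key=lambda t: scores[t])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    return [TAGS[t] for t in reversed(path)]


# Assumed scores for a two-token sentence; columns follow TAGS order.
emissions = [
    [1.0, 0.9, 0.0],  # token 1: "O" narrowly beats "B-PER"
    [0.2, 0.0, 1.0],  # token 2: "I-PER" is clearly preferred
]
# Assumed transition scores; "O" -> "I-PER" is heavily penalized.
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 0.0],    # from B-PER
    [0.0, 0.0, 0.0],    # from I-PER
]

print(softmax_decode(emissions))           # ['O', 'I-PER']
print(crf_decode(emissions, transitions))  # ['B-PER', 'I-PER']
```

In this toy case the independent decoder emits `I-PER` directly after `O`, which is not a valid BIO sequence, while the sequence-level decoder accepts a slightly lower score on the first token to produce a consistent entity span.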
Devin Hoesen
Ayu Purwarianti
Bandung Institute of Technology
Hoesen et al. (Thu,) studied this question.
www.synapsesocial.com/papers/6a07fd94eced9cc596fe0921 — DOI: https://doi.org/10.1109/ialp.2018.8629158