LanternAI is a handwriting recognition program for Chinese characters and radicals, built to help students learn Chinese! Written in Python, LanternAI uses machine learning to identify over 7,000 Chinese characters.
Chinese has more native speakers than any other language in the world, over 1 billion! The Chinese written language is character-based, with each character representing an idea. Many of these characters (ideograms) evolved from simple drawings of the objects they represent (pictograms).
One of the main challenges for learners is memorizing how to read and write characters. LanternAI offers an easy and accessible way to break down Chinese characters and to better visualize the relationship between how a character is written and its meaning and sound. Each character is broken down into components, with the main component being the character’s radical. Radicals hint at a character’s meaning and are used to classify it, much as prefixes do for English words.
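To make the idea of a breakdown concrete, here is a minimal sketch of how a character, its radical, and its components could be represented and grouped. The `CharacterEntry` class, the `group_by_radical` helper, and the three hand-written entries are illustrative assumptions, not LanternAI's actual data model or dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class CharacterEntry:
    """Hypothetical record for one character (not LanternAI's real schema)."""
    character: str
    radical: str                 # the main component
    components: Tuple[str, ...]  # all components, radical included
    pinyin: str
    meaning: str


# Hand-written illustrative entries, not taken from LanternAI's dictionary.
ENTRIES = [
    CharacterEntry("好", "女", ("女", "子"), "hǎo", "good"),
    CharacterEntry("妈", "女", ("女", "马"), "mā", "mother"),
    CharacterEntry("河", "氵", ("氵", "可"), "hé", "river"),
]


def group_by_radical(entries: Iterable[CharacterEntry]) -> Dict[str, List[str]]:
    """Map each radical to the characters that are classified under it."""
    groups: Dict[str, List[str]] = {}
    for entry in entries:
        groups.setdefault(entry.radical, []).append(entry.character)
    return groups


if __name__ == "__main__":
    for radical, characters in group_by_radical(ENTRIES).items():
        print(radical, characters)
```

Grouping by radical shows the classification role described above: 好 and 妈 both fall under 女 (woman), while 河 falls under 氵 (the water radical).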
We’ve trained an AI model to recognize these characters and radicals from images. We started with preprocessing: organizing and then resizing the dataset of over 3 million images that we used to train the machine learning network. Then we trained the model by repeatedly running it and making minor adjustments to the code each time, testing to see which changes improved the model’s accuracy the most. The adjustments included changing the filters, dense layers, kernel size, and number of convolution layers; adding callbacks to prevent overfitting; and using a different optimizer. We found that changing the optimizer had the greatest impact on the model’s accuracy. Finally, we made a drawing program and combined it with our character recognition model; this formed the prototype for LanternAI. We are currently in the process of connecting LanternAI with our trained radical recognition model and a dictionary so that it will be able to display the character’s radical and translated meaning in the results.
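Two of the steps above, resizing images and stopping training before the model overfits, can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `resize_nearest` and `early_stopping_epoch` are hypothetical, nearest-neighbor sampling is one of several possible resizing methods, and the project's real pipeline most likely relies on an image library and a deep learning framework's built-in callbacks rather than hand-written loops.

```python
from typing import List, Sequence

Image = List[List[int]]  # grayscale pixels in row-major order, values 0-255


def resize_nearest(image: Image, new_height: int, new_width: int) -> Image:
    """Resize a grayscale image using nearest-neighbor sampling.

    Each output pixel copies the source pixel at the proportionally
    scaled row and column position.
    """
    old_height, old_width = len(image), len(image[0])
    return [
        [
            image[row * old_height // new_height][col * old_width // new_width]
            for col in range(new_width)
        ]
        for row in range(new_height)
    ]


def early_stopping_epoch(val_losses: Sequence[float], patience: int) -> int:
    """Return the epoch index at which training would be stopped.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs. If that never happens, the index of
    the last epoch is returned. Assumes `val_losses` is non-empty.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


if __name__ == "__main__":
    # Toy 4x4 image shrunk to 2x2 (values are made up for illustration).
    toy_image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
    print(resize_nearest(toy_image, 2, 2))

    # Made-up validation losses: improvement stalls after the third epoch.
    print(early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.62, 0.61], patience=2))
```

Resizing every image to the same dimensions gives the network a fixed input shape, and the patience counter captures the idea behind an early-stopping callback: training halts when further epochs no longer improve results on held-out data.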