An ESL teacher in Saigon wrote me: “As you may have worked out already, the pronunciation of Vietnamese ESL learners is not great. I am looking at ways to try and improve the pronunciation of the learners at my school.”

“As a linguist, do you have any insights into spoken English and the difficulties that syllable-timed L1s (Vietnamese people) might have learning a stressed-timed L2 language?”

Pronunciation is always a problem for Asian students, but in my experience, having taught in a number of Asian countries, the Thais and Vietnamese seem to have the most problems with pronunciation. Chinese, Korean, or Khmer students have some consistent pronunciation problems, but they can make themselves understood. With Vietnamese, the pronunciation is often so far off that you have no idea what they are even trying to say.

When it comes to language learning, the Vietnamese are faced with several problems. At least two of them are unique to Vietnam, but the others seem to be consistent across Asia.

Let’s get the Asian-consistent problems out of the way first.

Listening: I am a proponent of ALG (Automatic Language Growth), a listening-only method of language acquisition. (You can watch some of my ALG videos here.)

Without going 100% into ALG or applying it exclusively to an ESL classroom in Vietnam, I believe, beyond any doubt, that a significant factor contributing to Vietnamese students having pronunciation issues is that they simply don’t listen enough. If you haven’t heard the sounds, you can’t reproduce them. In the commercial ESL marketplace, across Asia, parents are told that their children will be speaking English from their first day. The focus of the entire program is on speaking, rather than listening. Good foreign ESL teachers do model the target language, before asking students to produce it. But it’s not enough. When you learned English, you heard phrases hundreds or even thousands of times before you spoke them. But in Asian ESL, students are asked to produce after one or two hearings.

If you look at an ESL syllabus, there are always listening exercises built into the curriculum, but they generally account for less than ten percent of class time. Production accounts for the bulk of class time. This needs to be reversed: fifty minutes of listening and ten minutes of production would be a better ratio.

Along with the lack of listening in the programs, there is a cultural problem with listening. For whatever reason, it just seems that across Asia, listening skills, even in the mother tongue, are horrendous. It is particularly bad in Southeast Asia where, during a listening exercise, a student would think nothing of having a conversation with his neighbor or making a call on a cell phone.

Once again, if they don’t listen, they can’t learn the target language and won’t be able to reproduce it.

Culturally, there are a number of factors which adversely affect the Asian learner:

Face – students don’t want to make any mistakes because they could lose face.

Not wanting to stand out – in most cultures across the region, people are expected to conform and to fit into prescribed roles in the society or in the group. No one wants to stand out or innovate, even if it only means giving the answer to a question. Students will generally wait until a number of brave souls have answered before they will answer. This is true of all societies to varying degrees. But in Asia, the goals in a group activity are consensus and harmony, not standing out or being exceptional, as, say, an American would try to do.

The Confucian education system, which is prevalent in Taiwan, Korea, Japan, and Vietnam, is based on rote learning. Take Kung Fu as an example of the ultimate expression of Confucian learning: all of the Kung Fu movements which will ever exist, already exist. There will be no innovations and no additions. The best student is the one who most accurately copies his teacher and reproduces what the teacher does. In the days when people still fought with Kung Fu or used it for self-defense, the logic in the training was that the teacher would think of every possible attack situation the student would face. Actually, the teacher didn’t think of it; he learned it from his teacher, who learned it from his teacher. Then the student was taught one prescribed reaction to each situation he would face. So, the best fighter was the one who memorized the largest number of attacks and counters, because he had the highest probability of winning, no matter what attack came.

This type of logic is applied to all forms of pedagogy in Asia. Students are rewarded for copying their teacher. On an essay exam, the teacher expects to see the students reproduce, verbatim, his or her words from the lecture. In America, a student would normally receive a very low grade for dutifully repeating the teacher’s words rather than thinking of an original answer.

In language learning, this method is also applied, but it doesn’t work. Students are conditioned to react to very specific stimuli. If you don’t ask the question exactly as it is written in the book, there are cultural barriers that prevent the student from answering.

One day, my schedule called for my class to watch a DVD. I had the DVD, but I needed the player. I explained this to my Vietnamese co-teacher and asked her to “go get the DVD machine.” She had no idea what I was talking about. “The DVD machine. We need the DVD machine to watch the movie,” I told her. She left, and returned with a DVD. “NO, we have the DVD already. We need the machine,” I said. Then I stopped and remembered the exact verbiage. “I need the DVD player,” I said, and then everything was fine.

Obviously, language is a living, breathing thing which will not follow rules established in a classroom. Also, there are over 400 million native English speakers from countless countries, on ALL of the continents. They won’t all speak the same way. But the Vietnamese education system only prepares the students to deal with people who just stepped out of a textbook.

Now, getting to the specific issues of pronunciation for Vietnamese students: Vietnamese is a Mon-Khmer language. There are only two major Mon-Khmer languages (major meaning used as a national language). They are Vietnamese and Khmer. Vietnamese is tonal, whereas Khmer is not. But apart from the tones, the linguistic rhythms are quite similar. As for pronunciation, Mon-Khmer languages have a very limited number of terminal sounds. In Khmer, I think there are only 8 possible sounds that can come at the end of a word. In Vietnamese, the number is a bit higher, but still much lower than in English. This is significant because if the student’s mother tongue does not contain a certain sound, the student can’t hear it in another language. Or the sound may exist in the mother tongue, but never as a terminal sound. So, once again, if that sound is used as a terminal sound in English, the student doesn’t hear it.

When you hear the student speaking with tortured pronunciation, keep in mind that he is hearing something very similar to what is coming out of his mouth, which would explain why the students often don’t understand you.

I haven’t studied Vietnamese as deeply as I have Chinese, so I may be off here, but native Chinese speakers are not taught to recognize words by phonemes. They are taught to recognize words by tones. The tone is more important for conveying meaning than the phoneme. I would have to believe that to some degree this is the case in Vietnamese. It won’t be as severe as in Chinese because Mon-Khmer languages have multi-syllabic words. Chinese is composed of single syllables, so the tones are probably more important to tell them apart. With Vietnamese, because of my personal approach to study, I see that 80% of Vietnamese vocabulary is composed of single-syllable compound words derived from Chinese. But I am not certain if the Vietnamese interpret or hear their own language this way.

I think we can say that the lack of tones in English becomes a factor in listening comprehension and sound reproduction. But I am not sure to what degree this is a problem for Vietnamese students.

When you begin to learn the Vietnamese language, you see the Vietnamese alphabet, Quốc-ngữ, and think, “Oh, this is easy. It looks like the Roman alphabet.” But then when you read aloud, no Vietnamese person can understand you. The reason, of course, is that although the characters used in Quốc-ngữ are derived from the Roman or Latin alphabet, Quốc-ngữ is not the Roman or Latin alphabet. The pronunciation of the letters is quite different. The pronunciation of combinations of letters differs from the pronunciation of the same letters pronounced separately. The pronunciation of letters occurring at the ends of words is often different from when those same letters appear at the beginning or in the middle of a word.

When a westerner learns Chinese or Thai, he has no preconceived notions of how any of the strange characters should be pronounced. So, he simply listens to the teacher (hopefully) and repeats, with no bias. But when a westerner learns Vietnamese, he has to unlearn his assumptions about the Vietnamese writing system. It takes a long time for most people to do this, and very few will do it 100%.

Obviously, for Vietnamese learning English, the same must be true. If, in his mind, he is applying Vietnamese sound values to the Roman alphabet, his reading will be unintelligible.

Generally, when I write a piece about a language, I send it to my teacher, David Long, the world’s leading expert on ALG. He will read an article like this and say words to the effect of, “Interesting article. You brought up some good points. But none of this matters.” The short answer is: if you want students to have native-like pronunciation, they need to listen for 800 hours. The more the students listen, the better their pronunciation will be. It is that simple.