How to correctly split a string into words in JavaScript
- Author: Nico Prananta (@2co_p)
If you have been using .split(' ') to get an array of words from a string, you should stop doing so. It works well enough for English and other languages that put spaces between words, but it breaks down for languages such as Japanese that do not.
For example, if you have a string like this:
const str = '俺はルフィ!海賊王になる男だ!'
// The string contains no spaces, so split(' ') has nothing to split on
console.table(str.split(' '))
you will get this:
![](/_next/image?url=%2Fstatic%2Fimages%2Farticles%2Fsplit-japanese.png.webp&w=3840&q=75)
which is not what we want. The array has a single element containing the whole string, because there are no spaces to split on.
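In case the screenshot does not render, here is the same check as a small runnable sketch. The English sample string and the variable names are my own, added only for contrast.

```javascript
// Illustrative strings; the English one is an assumed example for contrast.
const english = 'I will be the pirate king'
const japanese = '俺はルフィ!海賊王になる男だ!'

// English puts spaces between words, so splitting on a space finds them.
const englishWords = english.split(' ')
console.log(englishWords.length) // 6

// The Japanese string has no spaces, so the result is one element: the input itself.
const japaneseWords = japanese.split(' ')
console.log(japaneseWords.length) // 1
console.log(japaneseWords[0] === japanese) // true
```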
Instead, we should use Intl.Segmenter, which splits text on locale-aware word boundaries!
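const str = '俺はルフィ!海賊王になる男だ!'
// Create a segmenter that applies Japanese word-boundary rules
const segmenterJa = new Intl.Segmenter('ja-JP', { granularity: 'word' })
// segment() returns an iterable of objects with segment, index, input, and isWordLike
const segments = segmenterJa.segment(str)
console.table(Array.from(segments))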
![](/_next/image?url=%2Fstatic%2Fimages%2Farticles%2Fintl-segmenter.png.webp&w=3840&q=75)
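Note that with 'word' granularity the segments also include punctuation and whitespace. Each segment carries an isWordLike flag, so a minimal sketch for keeping only the actual words could look like the following. The helper name splitWords and the English sample string are hypothetical, chosen for illustration. The exact boundaries for Japanese depend on the dictionary data shipped with your JavaScript engine, so no output is shown for that call.

```javascript
// Hypothetical helper: returns only the word-like segments of `text`.
function splitWords(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'word' })
  const words = []
  for (const { segment, isWordLike } of segmenter.segment(text)) {
    // isWordLike is false for punctuation and whitespace segments
    if (isWordLike) words.push(segment)
  }
  return words
}

console.log(splitWords('Hello, world! How are you?', 'en'))
// [ 'Hello', 'world', 'How', 'are', 'you' ]

// Same call for the Japanese example; the result depends on the engine's ICU data.
console.log(splitWords('俺はルフィ!海賊王になる男だ!', 'ja-JP'))
```

Unlike .split(' '), this also drops the comma, the exclamation mark, and the question mark from the English result.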
We can also use { granularity: 'sentence' } to get the sentences in the string.
![](/_next/image?url=%2Fstatic%2Fimages%2Farticles%2Fintl-segmenter-sentence.png.webp&w=3840&q=75)
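The code for the sentence version is not shown above, so here is a rough sketch of what it might look like. The helper name splitSentences and the English sample are assumptions for illustration, not part of the original example.

```javascript
// Hypothetical helper: splits `text` into sentences for the given locale.
function splitSentences(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'sentence' })
  // Sentence segments keep their trailing whitespace, so trim each one.
  return Array.from(segmenter.segment(text), (s) => s.segment.trim())
}

console.log(splitSentences('Hello world. How are you? Fine!', 'en'))
// [ 'Hello world.', 'How are you?', 'Fine!' ]

// Same call for the Japanese example from this post.
console.log(splitSentences('俺はルフィ!海賊王になる男だ!', 'ja-JP'))
```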