nico.fyi
    How to correctly split a string into words in JavaScript

    If you have been using .split(' ') to get an array of words from a string, you should stop doing so. It works well enough for English and other languages that put spaces between words, but it fails for languages such as Japanese, Chinese, or Thai, which do not.

    For example, if you have a string like this:

    const str = '俺はルフィ!海賊王になる男だ!'
    console.table(str.split(' '))
    

    you will get an array with a single element: the whole sentence, because the string contains no spaces. That is not what we want.

    Instead, we should use Intl.Segmenter, which splits text on locale-aware word boundaries!

    const str = '俺はルフィ!海賊王になる男だ!'
    const segmenterJa = new Intl.Segmenter('ja-JP', { granularity: 'word' })
    
    const segments = segmenterJa.segment(str)
    console.table(Array.from(segments))
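
With word granularity, each item the segmenter yields is an object with `segment`, `index`, `input`, and `isWordLike` fields, and punctuation such as `!` comes back as its own segment. To get a plain array of words, similar to what `.split(' ')` returns, one option is to keep only the word-like segments. The sketch below assumes that is what you want; `splitIntoWords` is a hypothetical helper name, not something from the original post, and the exact word boundaries for Japanese depend on the JavaScript engine's ICU data.

```javascript
// Hypothetical helper (not from the original post): returns only the
// word-like segments of `text`, dropping punctuation and whitespace.
function splitIntoWords(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'word' })
  return Array.from(segmenter.segment(text))
    .filter((s) => s.isWordLike) // false for punctuation and spaces
    .map((s) => s.segment) // keep just the text of each segment
}

const japaneseWords = splitIntoWords('俺はルフィ!海賊王になる男だ!', 'ja-JP')
console.log(japaneseWords)

const englishWords = splitIntoWords('Hello, world!', 'en')
console.log(englishWords)
```

The same helper works for English, where it also strips the punctuation that `.split(' ')` would leave attached to the words.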
    

    We can also use { granularity: 'sentence' } to get the sentences in the string.
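
As a minimal sketch of sentence granularity, the snippet below wraps the segmenter in a hypothetical `splitIntoSentences` helper (the name is an assumption for illustration). Sentence segments carry no `isWordLike` field, and each one keeps its trailing whitespace, so joining them gives back the original string; call `.trim()` on each segment if you do not want that whitespace.

```javascript
// Hypothetical helper (not from the original post): returns the text of
// each sentence segment, including any trailing whitespace.
function splitIntoSentences(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'sentence' })
  return Array.from(segmenter.segment(text), (s) => s.segment)
}

const japaneseText = '俺はルフィ!海賊王になる男だ!'
const japaneseSentences = splitIntoSentences(japaneseText, 'ja-JP')
console.log(japaneseSentences)

const englishText = 'Hello world. How are you?'
const englishSentences = splitIntoSentences(englishText, 'en')
console.log(englishSentences.map((s) => s.trim()))
```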

