How to split a Java String - 052



Hi there!

In this tutorial we're going to look at how to split a string in Java.

We can split the string by character or split the string by words.

We'll look at both!

We're going to look at simple string manipulation because it's a common task.

Quite often we're asked to take a string and manipulate it character by character.

That means we need to know how to split a string character by character in Java.

Or, more often, we're asked to split a Java string representing many words, and we'll need to look at it word by word.

The classic example for this is CSV files.

CSV stands for "comma-separated values".

Sometimes you'll hear it called "character-separated values".

Either way, we need a method to split Java strings.

First let's look at splitting Java strings by character.

What that means is if we're given the word "hello", we want to access each character.

That's "h", "e", "l", "l", and "o".

We'll start by creating a string with the literal value "hello".

To get an array of characters from this string, we'll call the method toCharArray().

This takes our string, and returns an array of characters.
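
As a minimal sketch of that call, here's what it could look like. The class name CharArrayDemo and the helper toChars are made up for this example; toCharArray() is the real String method.

```java
import java.util.Arrays;

// Hypothetical demo class, named only for this sketch.
class CharArrayDemo {
    // Returns the characters of the given string as a new array.
    static char[] toChars(String text) {
        return text.toCharArray();
    }

    public static void main(String[] args) {
        char[] letters = toChars("hello");
        // prints [h, e, l, l, o]
        System.out.println(Arrays.toString(letters));
    }
}
```

The array is a copy, so changing it never changes the original string.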

Let's assume we're asked to reverse a string, and for this exercise let's assume we don't know about the reverse method in StringBuilder.

If we wanted to do it in a production program, we'd just use StringBuilder's reverse method.
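
For reference, a short sketch of the StringBuilder approach might look like this. The class name BuiltInReverse is an assumption for the example; StringBuilder.reverse() is the standard library method.

```java
// Hypothetical demo class, named only for this sketch.
class BuiltInReverse {
    // Reverses the text using the standard StringBuilder.reverse() method.
    static String reverse(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    public static void main(String[] args) {
        // prints tset a si sihT
        System.out.println(reverse("This is a test"));
    }
}
```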

Let's do it manually, so we can get a feel for manipulating strings by character.

The first thing we want to do is convert our string to an array of characters.

Now we can't manipulate the string, since it's immutable, but we can move things in place in our array.

Our plan is this.

We'll take the first character, and swap it in place with the last character.

We'll keep moving the index in our array, and swap with the last position minus our index.

So 0 gets swapped with the last position, 1 gets swapped with the second to last position, and so on.

Now if we don't stop halfway, we'll reverse the string and then reverse it again.

So we need to stop in the middle.

We'll get the last index, which is the length minus one since we're working with a zero-based array.

And the middle is just half of that.

Now we'll iterate over our array swapping characters.

Finally we'll make a string with our character array, and return it.

So if we start with "This is a test", the method will return "tset a si sihT".
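
Putting those steps together, a sketch of the swap approach could look like the following. The class name SwapReverse and the variable names are assumptions for illustration, and the empty-string check is an extra guard that the walkthrough doesn't mention.

```java
// Hypothetical demo class, named only for this sketch.
class SwapReverse {
    static String reverse(String text) {
        // Extra guard (not in the walkthrough): an empty array has no index 0 to swap.
        if (text.isEmpty()) {
            return text;
        }
        char[] chars = text.toCharArray();
        int last = chars.length - 1; // last index of a zero-based array
        int middle = last / 2;       // stop here so we don't reverse twice
        for (int i = 0; i <= middle; i++) {
            // Swap position i with its mirror position from the end.
            char temp = chars[i];
            chars[i] = chars[last - i];
            chars[last - i] = temp;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // prints tset a si sihT
        System.out.println(reverse("This is a test"));
    }
}
```

For an odd length the middle character gets swapped with itself, which is harmless.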

We can also access characters in a Java string using the charAt method.

The charAt method returns the character at the position specified in the Java string.

We could simplify our reverse method by iterating the string backwards, and adding each character to an empty character array.

This method doesn't need a swap, so it's a bit easier to read.
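
Here's a sketch of that backwards walk, assuming a class named CharAtReverse invented for the example; charAt() is the real String method.

```java
// Hypothetical demo class, named only for this sketch.
class CharAtReverse {
    static String reverse(String text) {
        // Empty array with room for every character in the string.
        char[] reversed = new char[text.length()];
        int position = 0;
        // Walk the string from the last character to the first.
        for (int i = text.length() - 1; i >= 0; i--) {
            reversed[position] = text.charAt(i);
            position++;
        }
        return new String(reversed);
    }

    public static void main(String[] args) {
        // prints olleh
        System.out.println(reverse("hello"));
    }
}
```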

Now let's look at bigger things.

Let's split a Java string!

Assume we want to remove stop words from a sentence.

Stop words are words which aren't significant for use in search queries.

To make our example simple, we want to remove the word "the" from the sentence "The quick brown fox jumped over the lazy dog".

Our result should be "quick brown fox jumped over lazy dog".

To break up a Java string, we need to call the split method.

This splits the Java string into a string array.

We're passing a space character to the method, which is telling Java we want to split our Java string by spaces.

Next we'll add each word that's not "the" to a StringBuilder.

If we had more stop words we'd test each word against a set, but for our simple example we just want to manipulate the sentence one word at a time.

When we're done, we'll ask our builder to build a Java string and return it.

This removes our stop word from our sentence.
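
A sketch of the whole method might look like this. The class and method names are made up for the example, and comparing with equalsIgnoreCase is an assumption, made so the capitalized "The" at the start of the sentence is dropped too.

```java
// Hypothetical demo class, named only for this sketch.
class StopWordRemover {
    static String removeThe(String sentence) {
        // Split the sentence into words on single spaces.
        String[] words = sentence.split(" ");
        StringBuilder builder = new StringBuilder();
        for (String word : words) {
            // Assumption: ignore case so "The" and "the" are both removed.
            if (word.equalsIgnoreCase("the")) {
                continue;
            }
            // Put a space between words, but not before the first one.
            if (builder.length() > 0) {
                builder.append(' ');
            }
            builder.append(word);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // prints quick brown fox jumped over lazy dog
        System.out.println(removeThe("The quick brown fox jumped over the lazy dog"));
    }
}
```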

Going back to our original thought, how do we split a Java string by character, say a comma or a pipe?

That's what we'd need for CSV parsing.

Say we have a long string representing a single line in a CSV file.

A comma separates every value in the string.

How would we split this Java string?

Well we'd do it almost the same way.

Remember, when we called the Java string split method, we passed it a blank space.

If we wanted to split the Java string using another character, we'd pass a different character.

So in this case, we'd pass a comma to the split method.

This would give us our CSV values in an array.
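
As a rough sketch, splitting one CSV line could look like this. The class name CsvSplit and the sample line are invented for the example.

```java
import java.util.Arrays;

// Hypothetical demo class, named only for this sketch.
class CsvSplit {
    // Splits a single CSV line into its values on every comma.
    static String[] splitLine(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        String[] values = splitLine("alice,30,london");
        // prints [alice, 30, london]
        System.out.println(Arrays.toString(values));
    }
}
```

This simple version doesn't handle commas inside quoted values, and split drops empty values at the end of the line, so it's only a starting point for real CSV parsing.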

We can split our Java string on any character.

If you're looking closely, we're passing a string to the method.

Not a character.

The reason is we're passing a regular expression.

We'll cover regular expressions in depth later, but what we need to know now is some strings won't work.

For example, if we need to split a Java string by a period or dot, this gets a bit tricky.

Periods in regular expressions match any single character.

That's not what we want.

If we place a period inside the string and then try to split on it, it won't work.

In this case we need to escape the period with a backslash, which we write as "\\." in a Java string literal.
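
Here's a small sketch of the escaped version. The class name DotSplit and both helper methods are made up for the example. The second helper uses Pattern.quote, a standard library method that isn't covered in this tutorial, as one way to treat any delimiter (a pipe, say) as plain text.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, named only for this sketch.
class DotSplit {
    // Splits on a literal period by escaping it in the regular expression.
    static String[] splitOnDot(String text) {
        return text.split("\\.");
    }

    // Alternative: let Pattern.quote escape whatever delimiter we pass in.
    static String[] splitOnLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // prints [www, example, com]
        System.out.println(Arrays.toString(splitOnDot("www.example.com")));
        // prints [a, b, c]
        System.out.println(Arrays.toString(splitOnLiteral("a|b|c", "|")));
        // An unescaped period treats every character as a delimiter: prints 0
        System.out.println("www.example.com".split(".").length);
    }
}
```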

The key point to note is this is a regular expression, not just matching a string.

Despite all this, we can pass a string to the method and split on the string.

We're just using a literal string as our regular expression, rather than a pattern with special characters.

For example, in our stop words method, we could pass the stop word "the" to split our string, and then just recombine the pieces.

Of course it makes the spacing a bit wonky, but you get the idea.

In this case we're breaking the string into two, dividing on the word "the".
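
A sketch of that idea, with a class name and method name invented for the example, might be:

```java
// Hypothetical demo class, named only for this sketch.
class SplitOnWord {
    static String removeByWordSplit(String sentence) {
        // Split wherever the lowercase word "the" appears.
        String[] parts = sentence.split("the");
        StringBuilder builder = new StringBuilder();
        // Glue the pieces back together without the stop word.
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Note the double space left behind where "the" used to be.
        System.out.println(removeByWordSplit("The quick brown fox jumped over the lazy dog"));
    }
}
```

The match is case sensitive, so the capitalized "The" at the start stays put. It would also split inside longer words that contain "the", like "other", which is one more reason to prefer the word-by-word version.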

So that's how we split Java strings.

We can access individual characters either with the charAt() method, or by converting the Java string into a character array.

And we can divide strings based on regular expressions, usually a simple string.

Hopefully you'll find this useful in your Java programming.

If you have any questions, add them to the comments below.

If you got this far, hopefully you liked the video.

If so, make sure you like, share and subscribe!

And with that, I’ll see you in the next tutorial!