Naming and renaming columns in R dataframes



Hello and welcome to our lab. Today we're going to be talking about naming and renaming columns in our data frames. The simple function that we use to do this is names, and this tutorial will mainly focus on different ways that you can index data frame columns to more efficiently apply the names to either a whole variety of columns, a specific subset of columns, or one column in particular. We'll also introduce you to a couple of other functions, like paste, gsub and rownames, that you might find useful as well.

Okay, so let's switch over to RStudio and let's create our own data with really messy column names. So here we've got my data, which has got three columns by three rows. Let's give it some row names as well, so ant, bee and cat, for instance. Let's view this data.
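
A minimal sketch of the setup just described. The object name `mydata`, the messy column names and the values are all assumptions for illustration, since the on-screen code is not part of the transcript.

```r
# Hypothetical example data: three columns by three rows with messy names.
# check.names = FALSE stops data.frame() from silently tidying the names.
mydata <- data.frame(
  "first COLUMN" = c(1, 2, 3),
  "2nd_col"      = c(4, 5, 6),
  "Col-3!!"      = c(7, 8, 9),
  check.names = FALSE
)

# Row names as suggested in the narration.
rownames(mydata) <- c("ant", "bee", "cat")

# The data set is small enough to print straight to the console.
print(mydata)
```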

So this is a very small data set, so it's quite easy just to print this to the console. If we use the names function on my data, it will output as many elements as there are columns, in character form. Not only can we output and print the names of the columns of this specific data frame to the console, but we can then assign a new set of characters to each one of those column names. This is probably the command that you're going to use the most. It goes through all the columns, so in our case three columns, and renames them with characters that are more preferable.

Now, in this case we've still left inconsistencies between these column titles. This is so we can then go on and show some different types of indexing and keep changing up the data to converge on something that really is tidy.

Sometimes it might be the case that you just want to change one specific column title. To do this you can use the names function on the data, and then, after you close the bracket around the data, you can specify the column number. So here we do that and we're reassigning it a new name. So we have what we used previously, names called on my data, and this time we're asking it to look at just the third column. If you print this, yes, we've changed this third column but the rest have stayed the same.

In the case where there are many, many columns, it can be difficult to keep track of what name, and therefore attribute, a column has and how that relates to the column number. In this case a call like this becomes difficult, because you can't remember which column number corresponds to the new name you're wanting to assign. To get round this, instead of indexing by just a number you can index by the previous name. So here we're calling names of my data, we're indexing to the name of my data that is currently equal to col two, with the two spelt out, and we're reassigning this a new character. And if you run the names function, yes, this is starting to look a little bit neater.
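
The four steps above (print the names, rename everything, rename by position, rename by current name) could be sketched as follows. The data frame and every column name here are assumptions chosen to mirror the narration, not the names used in the video.

```r
# Hypothetical data with messy column names (all names are assumptions).
mydata <- data.frame(a = 1:3, b = 4:6, c = 7:9)
names(mydata) <- c("first COLUMN", "2nd_col", "Col-3!!")

# 1. Print the column names: a character vector, one element per column.
names(mydata)

# 2. Rename every column at once; some inconsistency is left in on purpose.
names(mydata) <- c("col-1", "col-two", "col3")

# 3. Rename a single column by its position.
names(mydata)[3] <- "col.3"

# 4. Rename a single column by its current name, so its position
#    does not need to be remembered.
names(mydata)[names(mydata) == "col-two"] <- "col-2"

names(mydata)
```

Indexing by the current name is the safer choice in wide data frames, because the call keeps working if columns are later reordered.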

Sometimes in column names there can be systematic differences between how different people have labeled the titles of columns. These two columns have dashes whereas this one has a full stop. It can be useful to substitute every occurrence of one character with another character, allowing you to make a quicker change to all column names. To do this we can use the gsub function. So here we've got gsub. It takes three inputs. The first input is the character that we're wanting to change, so in this case dashes. The second input is the character that we want to replace that with.

Finally, it asks what character string you want to make these changes to. In this case we pass in the previous list of characters for our column names, and reassign the result to names of my data. And yes, we can see that we now have consistency between our column names, with these dots now appearing where there were previously dashes, and we've done this in just one line of code.
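
As a rough illustration of that one-line substitution, assuming a data frame called `mydata` whose column names mix dashes and dots (the names are hypothetical):

```r
# Hypothetical data frame whose names mix dashes and dots.
mydata <- data.frame(1:3, 4:6, 7:9)
names(mydata) <- c("col-1", "col-2", "col.3")

# gsub(pattern, replacement, x): replace every dash with a dot.
# fixed = TRUE treats the pattern as a literal string, not a regular expression.
names(mydata) <- gsub("-", ".", names(mydata), fixed = TRUE)

names(mydata)
# [1] "col.1" "col.2" "col.3"
```

The `fixed = TRUE` argument is not mentioned in the narration; it is added here as a precaution, since characters such as `.` have a special meaning when used as a regular-expression pattern. If the names follow a regular pattern like this, `paste("col", 1:3, sep = ".")` would build the same three names directly.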

Before I go on to talk briefly about row names and setting them to a column variable, I just want to mention one of the common errors that can occur when reassigning characters. If R is given smart quotation marks it will throw an error. All good editing packages for R scripting won't input smart quotations; they'll put in quotation marks that work in R. If you're using a strange scripting editor then it may input smart quotations. Other ways to accidentally get smart quotation marks in your code are if you copy and paste code via Word or directly from the web, so beware.

The last thing I want to mention is the difficulty that can arise from having row names. The general tidy data principle is that row names are bad. This is because this data is hidden away outside of the main data frame and hasn't got a column associated with it, which would allow you to set it as a factor in your forthcoming analysis. So you may ask, if I was given data like this, how do I get round this problem? Well, we can call the row names by using the rownames function on our data frame, but this just prints all our row names to the console. We can then take this output and assign it to a new column within our data frame. So we've got our data frame, my data, we're going to index its columns with the dollar sign, we're going to create a new column called species, and then we're going to assign the row names to that column. So if we now look at the data, it's looking a bit more like this. You'll notice that in doing this we've created an extra column but we haven't removed the row names that we had before. So let's take a step just to remove those by assigning NULL to them. We've now got something that is a neat data set.

The final comment I just want to leave you with is that names is a really useful function that can be called in many different ways, but there are inconsistencies as to how different people set column names. Some people use the colnames function. There is no problem with using this; it does exactly the same thing, and in fact it works better in some cases because you can apply colnames to matrices as well as data frames.

Thanks for listening. I hope you found this tutorial on naming columns useful. If so, you might also be interested in other videos that we have in our playlists on data handling and cleaning and reading data. Also, you may want to consider subscribing to our labs channel.
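
For reference, the row-name steps from this tutorial might look like the sketch below. The data values and the column names other than `species` are assumptions, and the matrix `m` is a hypothetical example added to show `colnames` on a non-data-frame object.

```r
# Hypothetical data with species stored as row names (values are assumptions).
mydata <- data.frame(col.1 = 1:3, col.2 = 4:6, col.3 = 7:9,
                     row.names = c("ant", "bee", "cat"))

# rownames() returns the row names as a character vector.
rownames(mydata)

# Copy them into a proper column, using $ to create it.
mydata$species <- rownames(mydata)

# Assigning NULL drops the custom row names; R falls back to 1, 2, 3.
rownames(mydata) <- NULL

print(mydata)

# colnames() matches names() on a data frame, and also works on a matrix.
identical(colnames(mydata), names(mydata))
m <- matrix(1:6, nrow = 2)
colnames(m) <- c("a", "b", "c")
```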